FOREWORD



The Data Audit and Analysis Toolkit is intended to help those responsible for 
planning and implementing programs focused on the first college year to better 
understand the student experience during this critical period. The idea for the 
Toolkit grew out of our strong conviction that colleges and universities in the country 
typically “don’t know what they know” about the first year of college. Most insti- 
tutions have a lot of data about first-year students. But these data are frequently 
collected by different offices for different purposes and are not usually harnessed by 
faculty and staff to paint a comprehensive picture of what is happening to first-year 
students. The Toolkit provides a way to begin exploiting these hidden information 
resources to enhance both experiences and outcomes for students in their first 
college year. 

While the notion of a “toolkit” may at first seem mundane, we view this effort 
in the light of a larger vision provided by Russell Edgerton, Director of the Pew 
Forum on Undergraduate Learning. When Russ was leading the education grant- 
making program at The Pew Charitable Trusts, he inspired and funded a 
remarkable array of improvement initiatives for undergraduate education. Some 
of these, like John Gardner’s work in the Policy Center on the First Year of 
College, were intended to directly improve institutional practices. Working in 
Russ’ words “from the inside out,” they were designed to change the way colleges 
and universities do business by applying the best of what we know about what 
helps students learn and succeed. Others, like Peter Ewell’s work at the National 
Center for Higher Education Management Systems (NCHEMS) on accreditation
and public accountability, were intended to shape the broader conditions within 
which higher education institutions do their work. Operating “from the outside 
in,” they were designed to change public conversations about “quality” in higher 
education, and to create and align external incentives for institutions to act 
deliberately to improve undergraduate education. Running through both was the 
common theme of taking active, collective responsibility for student learning and 
success. The Toolkit is but one of many initiatives advanced in this spirit by the 
Policy Center on the First Year of College — which is itself one of some forty 
individual projects that are now members of the Pew Forum on Undergraduate 
Learning. Though the language of the Toolkit is of data elements and analysis, a 
common vision of success and improvement inspired its creation and should 
remain foremost in our minds. 

The specific idea for the Toolkit came up in a speech Peter delivered at John’s 
invitation to the National Forum on Assessment of the First College Year, held at 
the University of South Carolina in February 2000. Peter’s central theme in this 
talk was that college officials usually have only limited understanding of the “lived 
experience” of first-year college students — the often highly personal events and 
milestones that may make the difference between leaving an institution and 
sticking it out. With better understanding, educators could establish better 
policies, build better programs, and make better decisions. A second key point 
Peter made was how different and complex these “lived experiences” turn out to 
be. Behind the “averages” of most statistics are myriad real individuals — who 
come to college with different expectations and abilities, and who interact with the 
institution in distinctive ways. The same program may thus have very different 
effects on different kinds of students, and we establish “generic” programs at our 
peril. 

Understanding the diverse experiences of students in their first college year 
demands better information than most institutions can currently lay their hands 
on. A good first step is to identify, inventory, and round up the data that your insti- 
tution already has about first-year students. Capitalizing on NCHEMS’ experience 
in conducting “data audits” of this kind, we enlisted the help of ten pilot institutions 
to help us try out the concepts embodied in the Toolkit. Karen Paulson of NCHEMS 
took the lead in drafting the document and incorporating the lessons learned from 
the pilot institutions. Mike Siegel of the Policy Center did yeoman service in 
recruiting pilot schools and in coordinating the review and implementation
process. Based on the experiences of these pilot participants, institutions can 
benefit significantly from taking stock of their existing information resources on 
the first year of college. Any strategy for improvement, though, should utilize 
multiple measures in addition to the student-record information that the data audit 
will reveal. Prominent candidates for such additional measures are two data- 
collection approaches also underwritten by Pew — the National Survey of Student 
Engagement (NSSE) and the joint Policy Center and UCLA Higher Education 
Research Institute’s survey, Your First College Year. But whatever the approach
taken, institutions should be as proactive and creative as they can be in seeking 
multiple sources of information about how students experience and negotiate their 
critical initial encounter with college. 

The information that results from this exercise has many uses. Most important, of 
course, better understanding can lead to program improvement. Specific knowledge
of what works, for whom, and under what circumstances can help those responsible 
for first-year programs to design better interventions and experiences, tailored 
particularly to the needs and characteristics of different kinds of students. The 
same kind of information can help educators evaluate the effectiveness of these 
interventions and, if they are proven effective, can help them argue for continued 
funding in those tight budget years that seem to be all too common these days. 
Building the databases needed to understand the first year of college also positions 
institutions to gradually extend the coverage of their information resources to 
address the entire undergraduate experience. Concentrating initially on information 
to improve first-year success can thus address a prominent problem faced by many 
colleges and universities while it simultaneously provides the foundation for a 
more comprehensive campus assessment effort. 

But most important of all as you begin to use this Toolkit is to remember the 
original vision: increasing the success and academic performance of the diverse 
array of students who attend our many institutions. They and the public depend
on us to provide the effective academic programs and support services that can 
help them fulfill their rich and unique potentials. 



Peter Ewell and John Gardner 



ACKNOWLEDGEMENTS 



The National Center for Higher Education Management Systems wishes to thank 
the Policy Center on the First Year of College, located at Brevard College (NC), 
for their collaboration on the First Year Data Audit and Analysis Toolkit Project. 
We would also like to acknowledge The Pew Charitable Trusts and the Atlantic 
Philanthropies for their generous support of the project. 

Ten institutions participated in the pilot study of the Toolkit. Individuals from 
Augustana College (IL), The University of Minnesota-Duluth, Ohio University, 
Northeastern State Technical and Community College (TN), The University of 
Texas-El Paso, University of Cincinnati, Lynchburg College (VA), Blue Ridge 
Community College (VA), Santa Fe Community College (FL), and Washington 
State University gave countless hours to implementing data audits on each of their 
campuses. They read and gave useful comments on draft versions of the Toolkit 
and kept track of the opportunities and pitfalls they encountered while conducting 
their data audits. As a result of their input and comments, this Toolkit is a much 
stronger and more understandable document. 

The Policy Center on the First Year of College partners, John Gardner, Betsy 
Barefoot, Randy Swing, Marc Cutright, and Mike Siegel, helped shape the Toolkit 
through their comments on draft versions, discussions of potential areas for 
strengthening or trimming, and their support for the project all along. Special
recognition must be given to Dr. Michael Siegel at the Policy Center on the First
Year of College. During the Toolkit’s development, he collaborated at every stage: 
editing early drafts, presenting conference sessions about the Toolkit, answering 
questions about data audits, and supporting it through to completion. Thanks, 
Mike, you’re the best! 

NCHEMS colleagues Peter Ewell, John Clark, Linda Keep, Patrick Kelly, Clara 
Roberts, and Paula Schild read drafts of the Toolkit and gave invaluable editorial 
suggestions. 






INTRODUCTION 



Why a Data Audit? 

The basic objective of a data audit is to identify and inventory data sources and 
needs across the campus. Information derived from the audit can then be used to 
design and create a flexible analytical database suited to conducting a range of 
analyses about the first year of college on an on-demand basis. Put simply: a data 
audit allows an institution to periodically and systematically take stock of, and 
then mobilize, its data resources. All colleges and universities should consider con- 
ducting a data audit with regard to the first year of college in order to accurately 
assess the implementation and impact of the first year on students, faculty, and 
staff. If an institution chooses, data audits can be expanded to include the entire 
institution and data about students at all levels. 

A fundamental shift of perspective is required to assess the implementation 
and impact of the first year. Determining “what happened” and “what mattered” 
during that year involves moving from a cross-sectional to a longitudinal 
perspective. Data contained in live transactional databases such as admissions or 
registration systems, by their very nature, change every day. Therefore, using such
data directly to examine students and their behavior analytically has many drawbacks.
Instead we need to capture “snapshots”: that is, freeze carefully defined subsets of
these data at periodic intervals and archive them for later analysis. These subsets
of data can be used in combination to provide a
model of student movement through the curriculum and institution. Determining 
which particular data elements to capture in this manner — and where they can be 
found — is a primary objective of the data audit. Often data are found organized in 
databases by type of data or survey or survey administration. What we really want
for analysis, though, are data organized by student, analogous to a transcript that
assembles data about what happens to each student over time. Data of this kind
enable us to investigate the first-year student experience and to examine such
items as patterns
of retention and interrupted enrollment, the order in which courses are taken and 
completed (or dropped), and any association between academic success and 
participating in particular kinds of programs or interventions. 
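
The reorganization described above, from term-by-term snapshots to one record per student, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term codes, the field names (`student_id`, `credits_enrolled`), and the sample rows are hypothetical and are not drawn from the Toolkit or from any pilot institution.

```python
from collections import defaultdict

# Hypothetical frozen snapshots, one list of rows per term census date.
# Term codes and field names are illustrative assumptions only.
snapshots = {
    "2000FA": [
        {"student_id": "S1", "credits_enrolled": 15},
        {"student_id": "S2", "credits_enrolled": 12},
        {"student_id": "S3", "credits_enrolled": 9},
    ],
    "2001SP": [
        {"student_id": "S1", "credits_enrolled": 16},
        {"student_id": "S3", "credits_enrolled": 6},
    ],
    "2001FA": [
        {"student_id": "S1", "credits_enrolled": 15},
        {"student_id": "S2", "credits_enrolled": 12},
    ],
}


def build_student_records(snapshots):
    """Reorganize term-by-term snapshots into one record per student."""
    records = defaultdict(dict)
    for term, rows in snapshots.items():
        for row in rows:
            records[row["student_id"]][term] = row["credits_enrolled"]
    return dict(records)


def enrollment_pattern(record, terms):
    """Return a string with X for an enrolled term and - for a missing one."""
    return "".join("X" if term in record else "-" for term in terms)


terms = ["2000FA", "2001SP", "2001FA"]
student_records = build_student_records(snapshots)
patterns = {
    sid: enrollment_pattern(record, terms)
    for sid, record in student_records.items()
}
print(patterns)  # {'S1': 'XXX', 'S2': 'X-X', 'S3': 'XX-'}
```

In this sketch, a pattern such as `X-X` marks interrupted enrollment and `XX-` marks a student who did not return for the second fall. Neither is visible when each term's file is examined separately.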

Conducting a data audit and creating a database, however, are not ends in 
themselves but activities in an ongoing process, designed to enable campuses to 
more effectively understand and improve the experiences of their students in the 
first year of college. Examining patterns of student behavior and the effectiveness 
of first-year programs, therefore, is as much a matter of attitude as it is of 
technique. A key point here is simply the commitment to improve. Institutional 
commitment, supplemented with the flexibility and latitude to make changes in 
first-year programs and activities, will make a difference to students. Individuals 
involved in first-year-of-college programs should be continually encouraged to 
ask empirical questions about performance and effectiveness, and to back up their 
opinions and anecdotes with facts. It is appropriate to ask: “Is this an empirical 
question that can actually be answered and supported with some data?” 

This Toolkit is based on the premise that it is important to conduct a data audit 
and data analyses on the entire first year of college. This requires bringing together 
data already gathered and used, as well as data that are collected and unused, to get 
a holistic understanding of the first year of college, rather than focusing on separate 
activities, experiences, and classes. The Pew Charitable Trusts and The Atlantic 
Philanthropies generously supported the Policy Center on the First Year of 
College and the National Center for Higher Education Management Systems as 
they developed the documents and conducted the pilot study for the Toolkit. A call 
for participation in the pilot study yielded nineteen applications. From these, staff 
chose ten institutions to represent a range of institutions: Augustana College (IL),
The University of Minnesota-Duluth, Ohio University, Northeastern State
Technical and Community College (TN), The University of Texas-El Paso, 
University of Cincinnati, Lynchburg College (VA), Blue Ridge Community 
College (VA), Santa Fe Community College (FL), and Washington State 
University. Input from this diverse set of institutions has strengthened the Toolkit 
and made it more applicable in a variety of settings. 

The Administrative Rationale of the First Year Data Audit Toolkit is designed 
for use by academic affairs or administrative affairs administrators in order to 
build an argument for conducting a data audit on campus. The Administrative 
Rationale begins with a general overview explaining the importance of a data audit 
focused on the first year of college. Its next section briefly outlines how to foster a 
culture of evidence on campus and some tips for creating a “data-based dialogue” 
with various campus constituencies, followed by an outline of what is involved in 
conducting a data audit. The companion Technical Manual is for administrators
who want to know more in-depth information about data analyses. In addition
to providing a rationale for the data audit, the Technical Manual also includes a set 
of recommendations for a “common core” of data elements that institutions should 
consider assembling and maintaining in order to conduct analyses of the first 
year of college. This section is followed by a short discussion of the construction 
of longitudinal student databases. Finally, the Technical Manual concludes with 
the kinds of data analyses that might be used to illustrate what is happening in the 
first year of college, and a range of standard reporting templates are provided as 
an associated appendix. 



CREATING A CULTURE OF DATA USE



Conducting a data audit and creating a database for analysis are not ends in 
themselves but activities in an ongoing process designed to enable campuses to 
more effectively understand and improve the experiences of their students in the first 
year of college. Examining patterns of student behavior and the effectiveness of 
first-year programs, therefore, is as much a matter of attitude as it is of technique. 
A key point here is simply the desire to improve, and the flexibility and latitude to
make the kinds of changes in programs and activities that will make a difference. 
Campus leaders need to: 

• Foster this attitude continually, 

• Allow people “in the trenches” the discretion to change what they do, 

• Encourage active and ongoing participation of the faculty, staff, and students 
from across the institution, 

• Encourage as much use of public records and open access as is possible, 
given confidentiality guidelines, 

• Support institutional faculty, staff, and administrators with data and 
appropriate resources, and 

• Visibly celebrate their efforts and successes.

A second key point is to remember that the “right” things to do during the first
year of college are not just matters of opinion and debate, but can be investigated 
concretely with real data. As a result, all people involved in first-year-of-college 
programs should be continually encouraged to ask empirical questions about 
performance and effectiveness, and to back up their opinions and anecdotes with 
facts. In other words, whenever somebody is tempted to assert, “X is happening” 
or that “Y is the case,” (s)he should always pause to consider, “Is this an empirical 
question that can actually be answered and supported with some data?” 

When seeking to build a culture of data use on campus, it is also important to 
bear in mind the many different ways in which people use information. Most 
researchers or institutional analysts tend to adopt the rational perspective on data 
use, which assumes that those individuals running programs want information to 
make decisions. And, indeed, that is often the case. Real decisions must be made in 
first-year programs about such matters as whether to continue with particular 
program components, how much to invest in various activities, and how to establish 
priorities for serving specific types of students. It is equally important, though, to be 
aware that information serves a variety of other functions in any organizational 
setting. Among the most prominent of these are: 

• Problem Identification. Sometimes data are useful simply to signal the fact
that a problem exists that needs to be further investigated. In this regard, 
establishing statistical indicators may well be a profitable course of action. 
Monitoring indicators over time (e.g., annually, from term to term, etc.) can 
reveal the extent to which progress is being made in improving performance, 
or it can chart important changes in student behaviors or conditions. Graphic 
or visual displays of such information are often useful for problem 
identification because they can quickly be scanned for anomalies. For the 
first year of college, for example, useful statistical indicators might include: 

✓ First-to-second-term persistence (reenrollment) rate,

✓ Fall-to-fall reenrollment rate,

✓ Percent of first-year students in academic difficulty,

✓ Percent of first-year students requiring and completing developmental
work in basic skills areas (reading, writing, math),

✓ Number of violations of established policies for placing students into
courses or prerequisite course sequences,

✓ Student/faculty and student/advisor ratios, or

✓ Percentage of courses dropped.

• Context Setting. Another prominent use of the kinds of information generated
through a first-year data audit is simply to paint a broad picture of what is 
happening in a particular setting or for a targeted population. In contrast to 
problem identification, in which very specific pieces of data are used as indi- 
cators that point to an underlying condition or phenomenon, the objective here 
is to flesh out a situation as completely as possible using as much information 
as possible. An example for the first year might include an in-depth look at the 
experiences of male students of color, drawing on data about basic patterns of 
persistence and coursetaking, questionnaire data on attitudes and perceptions,
and data about participation in and reactions to first-year programming. 
Presentation of such results usually emphasizes how the various individual 
pieces of data fit together to yield a comprehensive and integrated “story” of 
what is happening. Consistent with this emphasis, qualitative data drawn from 
observations and interviews are often used in conjunction with statistics — 
both in order to expand the portrait of experience being created and to ren- 
der the presentation more “real.” 

• Informing Discussion. Because academic settings are highly participatory,
decisions are often long in coming and discussions of opinions and options are 
frequently long and arduous. Concrete data are useful in such settings to focus 
discussion and to close off obviously unproductive lines of thinking. At the 
outset, for example, a concrete piece of data about a student experience or 
about the effectiveness of a particular program element can generate a far more 
focused and useful discussion of what might be done rather than a vague 
feeling that “something is wrong.” At least as important, using data judi- 
ciously can also help guide a wandering discussion and can discipline it so 
that uninformed opinions are less dominant. Committees are a fact of life in 
the academy, and many first-year activities are governed or advised by them. 
Using data to frame and steer committee discussions in productive ways 
(away from mere anecdotal stories) can thus be especially important.

• Selling Decisions. Decisionmaking is always complex, and decisionmakers
rarely make a decision only on the basis of formally supplied information. 
Additional factors will always include political climate, perceptions of poten- 
tial impact, and a good deal of plain “gut feeling.” Nevertheless, given this 
complexity, data are often useful in explaining a decided-upon course of action
after the fact. This strategy helps mobilize support for the decision, and 
allows the decision to be easily explained to those not involved in making it but 
whose “buy-in” is nevertheless important. At the same time, information can 
be especially important in making a case to funders that a particular program 
or line of work is critical. While seemingly cynical, this use of information 
is nevertheless important in the real world of academic decisionmaking and 
those responsible for first-year-of-college programs ignore it at their peril. 
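
Several of the statistical indicators listed under Problem Identification above are simple ratios once data are organized by student. The following Python sketch shows one possible way to compute three of them. The record layout, field names, and sample cohort are hypothetical assumptions for illustration, and each institution would need to settle its own definitions (for example, which students belong to the entering cohort).

```python
def rate(numerator, denominator):
    """Return a percentage rounded to one decimal place, or None if undefined."""
    if denominator == 0:
        return None
    return round(100.0 * numerator / denominator, 1)


# Hypothetical student-level records for one entering fall cohort.
# Field names are illustrative assumptions, not a prescribed data layout.
cohort = [
    {"id": "S1", "enrolled_spring": True, "enrolled_next_fall": True,
     "courses_attempted": 10, "courses_dropped": 0},
    {"id": "S2", "enrolled_spring": True, "enrolled_next_fall": False,
     "courses_attempted": 8, "courses_dropped": 2},
    {"id": "S3", "enrolled_spring": False, "enrolled_next_fall": False,
     "courses_attempted": 4, "courses_dropped": 1},
    {"id": "S4", "enrolled_spring": True, "enrolled_next_fall": True,
     "courses_attempted": 10, "courses_dropped": 1},
]


def first_year_indicators(cohort):
    """Compute a few first-year indicators for a single entering cohort."""
    n = len(cohort)
    attempted = sum(s["courses_attempted"] for s in cohort)
    dropped = sum(s["courses_dropped"] for s in cohort)
    return {
        "first_to_second_term_persistence_pct": rate(
            sum(1 for s in cohort if s["enrolled_spring"]), n),
        "fall_to_fall_reenrollment_pct": rate(
            sum(1 for s in cohort if s["enrolled_next_fall"]), n),
        "courses_dropped_pct": rate(dropped, attempted),
    }


indicators = first_year_indicators(cohort)
print(indicators)
```

For the four sample students, this yields 75.0 percent first-to-second-term persistence, 50.0 percent fall-to-fall reenrollment, and 12.5 percent of courses dropped. Running the same function on each year's cohort gives the kind of series that can be monitored over time.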

Strategies 

There are also a number of proven tactics for using information in productive 
ways on campus and for getting people involved in looking at data. Among the 
most useful are the following: 

• Expectation Exercises. One of the most frequently encountered reactions
when sharing a piece of information with a campus audience is, “I already 
knew that.” This response may occur because individuals want to feel that 
they grasp situations fully, even though they may not have thought much 
about them in advance. Partly it is because the human mind is good at think- 
ing up explanations for things after the fact — and thus not being “surprised” 
by them. But this reaction often gets in the way of acting on information in 
real-world situations. One way to counter it is, before the results are 
revealed, to ask those involved what they think the result of any analysis or 
data-gathering exercise is going to be. (For example, if you ask faculty at 
many regional state institutions what the mix of degrees granted in a year 
might be, their answers are often heavily weighted toward the liberal arts.)
This exercise makes participants think concretely about consequences and 
possible actions from the outset. More importantly, it provides a baseline 
against which the actual results can be compared, once they are distributed. 
(Continuing the example, the reality of the degree mix for regional state
institutions is usually heavily weighted toward business and education
degrees, which are professional rather than liberal arts degrees.) Differences between
the forecast and the truth often provide a springboard for discussions about 
action implications because people are surprised and more likely to then be 
drawn into the discussion. See Figure 1 for an example. 

• Discrepancy Studies. Along the same lines, data are often most powerful in
generating interest or in starting discussions when they are packaged around 
a discrepancy. Discrepancies can be of many kinds, for instance, between: 

✓ Expectations and actuality (as above), 

✓ Established targets and actual performance, 

✓ Aspirations and reality, 

✓ Standing policies and real behavior, or 

✓ One population group and another. 

By their very nature, discrepancies tend to command more attention than a
number presented alone. A particularly powerful way to start discussions
about advising, for instance, is to present data on student course-taking 
behavior that suggest established prerequisite policies are being violated and 
that students are failing subsequent courses as a result. 

• Beginning with a Recognized Problem. Most people are not interested in
data for its own sake. As a result, it is often a challenge to build support for 
a campus-wide project whose sole objective appears to be to improve data 
resources. Instead, it is usually better to begin such efforts with a presenting 
problem that is apparent to everybody — for example, widespread academic 
failure among first-generation students, visible shortfalls in quantitative rea- 
soning skills among entering students, or uneven teaching quality in multi- 
section courses. Obviously, such presenting problems will be different on 
each campus and cannot be predicted. Indeed, the “demand” side of the data 
audit process is often useful precisely because it unearths such examples. 
Once identified, much of the effort can then be packaged around the need to 
address such concrete, widely recognized problems rather than based on just 
a vague need for better data.

• Creating Public Opportunities for Discussing Data. For similar reasons,
many campuses have found it valuable to create highly participatory occa- 
sions to discuss the implications of data findings. Such discussions can 
involve broad cross-sections of the campus community or be limited to those 
directly involved in running programs and are often conducted during non- 
peak scheduling periods in retreat settings. One public university, for exam- 
ple, holds a summer planning retreat each year with broad participation from 
faculty and program staff. At the retreat, a few key data findings are presented
and participants break up into small working groups to brainstorm
ideas about what might be done in response. Results of these sessions are 
then shared and discussed, and become action priorities for the coming year. 
Many variations on this theme are possible, but all involve presenting select- 
ed statistics, then gathering a group of people (including students) to discuss 
their implications. 

• Avoiding Data Overload. Many analysts err in the direction of trying to
report too much when they present findings, either in report form or in public
occasions such as those noted above. Analyses should be comprehensive
and thorough but it is usually better to release a few carefully chosen find- 
ings, organized around issues or problems that are important, rather than 
present a “data dump.” Answering the inevitable questions that a limited set 
of findings will generate and thus initiating a “data dialogue” is the best way 
to get people hooked on information. 
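
The prerequisite example given under Discrepancy Studies above, a gap between standing policy and real behavior, lends itself to a simple check against course histories. The Python sketch below is one hedged way to flag such cases. The course codes, the prerequisite table, the set of passing grades, and the sample history are all hypothetical assumptions, and a real policy may include exceptions (placement tests, transfer credit) that this sketch ignores.

```python
# Hypothetical prerequisite policy: course -> required prior course.
PREREQUISITES = {"MATH102": "MATH101", "ENGL102": "ENGL101"}

# Assumption: any of these grades counts as completing the prerequisite.
PASSING_GRADES = {"A", "B", "C", "D"}

# Hypothetical course history rows: (student_id, term_index, course, grade).
history = [
    ("S1", 1, "MATH101", "B"),
    ("S1", 2, "MATH102", "C"),
    ("S2", 1, "MATH102", "F"),
    ("S3", 1, "ENGL101", "F"),
    ("S3", 2, "ENGL102", "D"),
]


def find_prerequisite_violations(history, prerequisites, passing_grades):
    """List enrollments where the prerequisite was not passed in an earlier term."""
    # Earliest term in which each student passed each course.
    first_passed = {}
    for student, term, course, grade in history:
        if grade in passing_grades:
            key = (student, course)
            first_passed[key] = min(term, first_passed.get(key, term))

    violations = []
    for student, term, course, grade in history:
        prereq = prerequisites.get(course)
        if prereq is None:
            continue  # this course has no prerequisite in the policy table
        passed_term = first_passed.get((student, prereq))
        if passed_term is None or passed_term >= term:
            violations.append((student, course, grade))
    return violations


violations = find_prerequisite_violations(history, PREREQUISITES, PASSING_GRADES)
print(violations)  # [('S2', 'MATH102', 'F'), ('S3', 'ENGL102', 'D')]
```

Keeping the grade in each flagged row makes it possible to compare outcomes for students who skipped a prerequisite with outcomes for those who did not, which is the comparison that gives the discrepancy its force in discussion.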

A final point about building cultures of evidence is that action and follow- 
through are the most important conditions of all. Few people are interested in 
investing in information if it is clear that nobody will act on it and that nothing 
will change. Conversely, one of the best ways to promote involvement is to 
actively demonstrate that change is intended and possible. As a result, it is fre- 
quently useful to undertake reasonably small projects at first, where follow- 
through can be demonstrated immediately to potentially doubting constituencies. 



THE FIRST YEAR OF COLLEGE 



Why Are First-Year Data Important?

The first year of college is a confusing time for students, faculty, and college 
personnel. Whether at a community college or at a four-year institution, multiple 
programs are often in place, offered to different types of students, creating multiple 
experiences with many different types of interactions. Cause and effect is always 
an issue. Determining which programs and which interactions have beneficial
effects for which groups of students is often difficult. We need a lot
of data, often from disparate systems or offices, collected systematically, and 
organized appropriately in order to conduct such analyses. 

The first year is also a logical place to anchor the development of a wider 
institutional assessment effort. Data collected on the first year of college can be 
the foundation for expanded data use and analyses on the entire institutional 
experience as warranted. Though complex, the first year usually consists of a 
well-delineated set of experiences for an easily identified set of students. It is, 
therefore, a manageable place to start when building an evaluation capacity at any 
institution. Furthermore, it makes chronological sense to begin a larger longitudi- 
nal study of student experience with the first year. Once baseline data about the 
characteristics and experiences of an entering cohort of students are assembled, it 
is possible to continue to capture information about these students throughout
their academic careers. 

Finally, information about the effectiveness of first-year-of-college programs 
gives program directors an important resource to make the case for which programs 
to continue and target for possible expansion and which to discontinue.
First-year-of-college programs often comprise politically fragile and specially funded activities,
so evaluating effectiveness is critical to proving their ultimate worth. Data must 
be presented in ways that facilitate discussions about future investments. From a 
wider perspective, such discussions may simultaneously help to develop a “culture 
of data use” on campus for the long term that will aid not only first-year but other 
activities as well. 

Institutional Questions About the First Year of College 

How should we analytically untangle the many elements of the first year of 
college and dissect what makes it work? Underlying this master question are four 
more focused questions:

A. What is planned for the first year of college? 

B. Who is involved in the first year of college? 

C. What happened (and where) during the first year of college? 

D. What mattered (and why) during the first year of college? 

A. What Is Planned for the First Year of College?

An initial question to be asked has to do with identifying the objectives of the 
first year of college at your institution. Even more basically, one might ask whether 
the first year of college is conceived as an integrated and intentional set of experi- 
ences that students are actively advised through and participate in. In initially 
establishing first-year-of-college programs, most institutions will have already 
answered this question in the affirmative. Given the existence of a “program,” 
though, are its objectives clearly defined? Like learning outcome statements for 
a curriculum, it is important to define the objectives of first-year activities
specifically in terms of: 

• How individual students will be different, 

• When that difference is expected (initially or after participation is complete), 
and 

• What students will be required to do in the first year of college. 

Instead of being defined for individual students in this manner, program objectives 
are often framed more generically in terms of what the institution will do or what 
the institution wants to happen for the student body as a whole, or perhaps what an 
institution wants to happen for an identified group of students. Defining objectives 
for a student body as a whole rather than for individual students should be avoided 
because it is far less useful in providing guidance for assessment and evaluation.

Who should be involved in designing learning objectives for the first college
year? Stakeholders to be involved would probably include student affairs profes- 
sionals, departments and faculty teaching first-year courses, and residence hall 
staff where appropriate. Once objectives for the first year of college are defined, 
then it is necessary to clarify their meaning and implications with the groups 
responsible for the various activities. 

A primary objective of first-year-of-college programs is to ensure continued 
student success. It is important to emphasize that proof of this objective is always 
found after the fact. It is manifested in what happens next for the student at the 
institution and within the curriculum, for example, persistence and ultimate grad- 
uation, actual levels of student performance in subsequent academic coursework, 
and the achievement of particular learning outcomes. 

Potential outcomes for students in their first year of college include, but are not 
limited to: 

• Developing foundational academic skills such as quantitative, writing, 
speaking, technology or information literacy skills. 

• Learning how to “negotiate” college and the collegiate culture.

• Managing academic life and learning good academic practices, such as what
constitutes scholarly work and the difference between primary and secondary sources.

• Developing appropriate non-cognitive abilities and attitudes like motivation, 
self-worth, and respect for others. 

• Learning how to balance academic work with social life, and often, family 
responsibilities. 

• Developing approaches to critical thinking and problem solving appropriate 
to a variety of academic disciplines. 

Each of these possible outcomes suggests a particular evaluative line of inquiry 
and a specific set of data sources that might be tapped. In addition, the first year 
of college is often a testing ground for innovative practices that might be extended 
throughout the college experience if they prove effective. Examples of such practices 
include peer mentorship and collaboration, problem-based learning, and hands-on 
engagement with subject matter. Given their potential wider significance, it is 
always wise to evaluate the impact and effectiveness of such innovations in some 
detail. 

B. Who Is Involved in the First Year of College?

It is important to identify the specific characteristics of the students and faculty 
who participate in the first year of college. While we may think we know our 
students well, we often harbor unexamined assumptions about their backgrounds, 
attitudes and capabilities. For example, we will probably want to know a good 
deal about the following: 

• Student demographic characteristics like gender, race and ethnicity, age, 
disability status, family background, and whether students’ parents attended 
college. 

• Previous educational experiences and achievements of first-year students. 

• Student educational and career aspirations, attitudes toward attending college, 
and areas about which first-year students are apprehensive or expect to 
encounter difficulties.

• Characteristics of the faculty and staff who work with first-year students
including demographics, professional background and experience, and what 
they expect of students. 

These factors can often interact with one another in complex ways to create 
specific populations of students and staff. For example, knowing that at an insti- 
tution the “average age of an incoming student is 25” often masks the fact that 
there may be two distinct populations (one of 18-year-olds and the other composed
of more mature students) who are likely to behave very differently. While
such issues might simply be a problem of data presentation, they can also be an 
“institutional myth” that could be addressed by further data disaggregation. 
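
The disaggregation point can be illustrated with a short Python sketch. The ages below are invented for illustration, and the cutoff of 22 used to separate the two groups is an assumption, not a definition taken from this Toolkit.

```python
from statistics import mean

# Hypothetical ages of incoming students (invented for illustration only).
ages = [18] * 6 + [32] * 6


def age_group(age):
    """Assumed cutoff: students under 22 form one group, all others a second."""
    return "under 22" if age < 22 else "22 and over"


overall_mean = mean(ages)

# Disaggregate: collect ages by group, then summarize each group separately.
groups = {}
for age in ages:
    groups.setdefault(age_group(age), []).append(age)

group_means = {name: mean(values) for name, values in groups.items()}

print(f"Overall average age: {overall_mean}")
for name, values in groups.items():
    print(f"{name}: n={len(values)}, average age={group_means[name]}")
```

In this invented example the overall average is 25 even though no student in the list is 25 years old; only the group-level summaries reveal the two populations.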

The key is to always remember that real students, faculty, and staff, who bring 
a broad cross-section of diverse experiences and perspectives with them to the 
institution, populate the first year of college. Unless we know a good deal about 
these experiences and perspectives, it will be hard to figure out what is going on. 

C. What Happened During the First Year of College, and Where Did It
Happen?

The question of what actually happened to students during their first year of 
college is rarely asked systematically. Instead, we tend to assume that all first-year 
programs were implemented as planned and that the experiences of all students 
were uniform. But this is frequently not the case. Some experiences are planned 
and explicit while others are spontaneous, amorphous, and random. An opera- 
tional mantra that should therefore continually be kept in mind is, “Adopt the 
student’s point of view.” This essential change of lens from our perspective to 
the student’s perspective is critical to determining what really happened to whom. 
It requires not “looking at” students but instead “looking through” students’ eyes 
to determine the actual behaviors they engage in when they encounter and act out 
the programs we put in place, as well as what experiences they brought with them 
to the programs. Sometimes the only way to get the answers to such questions is 
to “walk the process” by putting yourself in the student’s shoes and duplicating 
and documenting each step directly. For example, one such analysis at a large 
university revealed that students were often missing the first ten minutes of
several of their classes simply because they could not get across campus from
their last class fast enough to show up on time. 

By adopting the student’s perspective, most people discover that what actually 
happens to students in the first year depends a lot on the successful implementation 
of programs and courses as planned. However, few activities or programs are 
actually implemented as planned. Programs often show little impact when evaluated 
because they were never successfully implemented, not because they were 
inherently ineffective. For instance, if a part of a first-year program centered on a
particular instructional strategy (attending a ropes course or using a new software
product) that was not available until halfway through the term, that is an
implementation problem. As Joan Stark, professor at the University of Michigan, has
pointed out, there are always significant differences between the design, the delivery, 
and the resulting student experiences associated with any curriculum (Stark and 
Lowther, 1986). Therefore, it is necessary to look for what interfered with full 
implementation or what situations arose that altered the original implementation 
plan. 

Three specific syndromes common to the implementation of any program, 
including those in the first college year, often contribute to this problem and 
should be anticipated: 

• Piecemeal development of programs and program elements that do not fit 
together very well. Often this approach results in duplication of efforts or 
gaps in service that are very apparent to students but not always obvious to 
faculty and administrators. 

• Rushing to implement any new design. This situation often introduces a good 
deal of unintended variation in the way programs are implemented across 
departments, units, or locations — resulting in uneven (or even contradictory) 
effects. 

• Adoption of a “true believer” stance that assumes automatically that certain 
things must be effective (e.g., small classes, collaboration in all circumstances, 
etc.). This attitude is often an admirable characteristic of programs about 
which people care deeply, but unexamined assumptions about effectiveness 
may mask real difficulties in implementation or design.

All three syndromes suggest devoting much more attention to questioning our
assumptions about first-year programs from the outset. Rather than relying on what
you think might have happened, it is always wise to check these assumptions against
real data.

D. What Mattered During the First Year of College, and Why Did It Matter?

The question of impact, of course, is ultimately what we want to get to in any 
analysis. Hopefully first-year experiences result in identifiable and beneficial 
changes in behavior, attitudes, and cognitive abilities that are consistent with 
program goals. The analytical task associated with answering the question “What 
mattered?” is to look for longitudinal paths of student learning and development 
through the curriculum and extracurricular activities that are consistent with the 
individual student outcomes that you want to achieve. This task requires an 
essential shift of perspective from a “still photo/snapshot” view of college life to 
a “moving picture” perspective that emphasizes development and attainment. 
Doing so enables us to look for different patterns of student movement and flow 
through the college experience that are created by interactions among the formal 
curriculum, co-curricular activities, and students’ own extra-collegiate experiences. 
Taking this perspective introduces many behavioral questions that need to be 
addressed, such as: 

• In what order do students take particular classes and co-curricular activities, 
and how frequently do they participate in particular experiences? 

• Do students actually follow the advice given to them in advisement, and 
what difference did it make? 

• What kinds of experiences mattered most for what kinds of students in terms 
of cognitive or affective development? 

Ultimately, of course, the question of “what mattered” needs to be addressed in 
terms of intended outcomes and program objectives — which is why it is so impor- 
tant to be precise about these in the first place. The first year of college may also 
have many unintended or unplanned consequences for students, both for the better
and for the worse. As a result, it is always wise to build flexibility into databases
and analyses with the expectation that the unexpected will happen. 

Given these demands for evidence to document the first college year — and the 
failure of most institutions to systematically determine who is involved, what is 
happening, and what mattered in this period — it pays to be systematic about 
assembling data resources. Techniques for doing so are the central concern of this 
Toolkit. Going beyond technique, the basic mindset of questioning assumptions 
and of constantly posing and addressing the four basic questions discussed in this 
section — what, who, what happened, what mattered — will always be helpful. 



THE DATA AUDIT 



What Is a Data Audit and Why Do It? 

Data Audit: The process of identifying data resources and uses
wherever they may be within an institution and gathering them
into a useable information system.

The basic objective of a data audit is to identify and inventory data sources and 
needs across the campus. Information derived from the audit can then be used to 
design and create a flexible analytical database suited to conducting a range of 
analyses about the first year of college on an on-demand basis. Such a database is 
most useful if it is separated from the regular student information system kept by 
the registrar. By their very nature, the data contained in live transactional data- 
bases — like admissions or registration systems — change every day. Therefore, using 
such data directly to examine students and their behaviors analytically has many 
drawbacks. To move beyond a view of students based solely on glimpses of
the student information system, we need instead to capture “snapshots”
of student data that contain carefully defined subsets of data at periodic intervals 
and archive them for later analyses. These analyses will often require using these 
“snapshots” in combination to create a “moving picture” that approximates student 
movement through the curriculum. Determining which particular pieces of data,
or data elements, to capture in this manner, and where they can be found, is a
primary objective of the data audit. A summary of data audit steps is included at 
the end of this section. 
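
As a rough sketch of how archived “snapshots” might be combined into a “moving picture,” the Python example below lines up several dated extracts into one time-ordered record per student. The snapshot dates, student IDs, and the "credits" field are all hypothetical; a real analytical database would hold the data elements identified by the audit.

```python
# Hypothetical census-date extracts, keyed by ISO snapshot date.
# Each extract maps a student ID to the frozen values captured on that date.
snapshots = {
    "2001-09-10": {"S001": {"credits": 15}, "S002": {"credits": 12}},
    "2002-01-28": {"S001": {"credits": 16}},
    "2002-09-09": {"S001": {"credits": 15}, "S002": {"credits": 9}},
}


def build_history(snapshots, field):
    """Line up archived snapshots into one time-ordered record per student.

    A value of None means the student did not appear in that snapshot.
    """
    dates = sorted(snapshots)  # ISO date strings sort chronologically
    student_ids = sorted({sid for extract in snapshots.values() for sid in extract})
    return {
        sid: [(date, snapshots[date].get(sid, {}).get(field)) for date in dates]
        for sid in student_ids
    }


history = build_history(snapshots, "credits")
for sid, path in history.items():
    print(sid, path)
```

In this invented data, student S002 is missing from the middle snapshot and reappears in the third, a pattern that a single look at the live system would not show.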

Put simply, a data audit allows an institution to take stock of and then mobilize 
its data resources. All colleges and universities should want to take this action 
with regard to the first year of college for the reasons presented in this document: 
a) “generic” programs are seldom useful for real (and therefore different) types of 
students; and b) factors that affect one sector of the student population may not 
affect another, resulting in differing implications for both policy and intervention 
strategies. The capability to analytically disaggregate the student population to 
determine what works for whom is therefore critical. 

Elements of a Typical Data Audit 

A data audit consists of two primary activities: 

• Examining existing data sources at the institution wherever these may be 
found, and 

• Determining those data that are most critical for evaluation, assessment, and 
decisionmaking needs. 

These two activities can be thought of as building campuswide understanding, 
respectively, of the “supply” of data and the “demand” for data. Conducting a 
data audit thus involves identifying data sources, creating data inventories, and 
documenting data collection methods and routines already in place. Examining 
management and decisionmaking needs, in turn, requires determining schedules and 
formats for submitting data or information to external constituencies (e.g., accreditors 
or the state) and determining whether there are management needs for information 
that are not currently being fulfilled. 
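
One simple way to picture the “supply” and “demand” comparison is as a check of required data elements against those that some unit already keeps. The sketch below is illustrative only: the unit names, information needs, and data element names are hypothetical placeholders.

```python
# Hypothetical inventory: data elements each unit reports keeping ("supply").
supply = {
    "Registrar": {"credits_attempted", "term_gpa"},
    "Advising": {"advising_contacts"},
    "Testing Office": {"english_placement"},
}

# Hypothetical information needs and the elements each would require ("demand").
demand = {
    "Accreditation self-study": {"term_gpa", "english_placement"},
    "First-year seminar evaluation": {"advising_contacts", "seminar_attendance"},
}

# Every element that at least one unit keeps.
available = set().union(*supply.values())

# For each need, list the required elements that no unit currently keeps.
unmet = {
    need: sorted(required - available)
    for need, required in demand.items()
    if required - available
}
print(unmet)
```

Needs that appear in the result point to information demands that current data collection does not yet fulfill.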

While there are many different ways to conduct a data audit, these two over- 
arching purposes — to determine data sources and data needs — should always 
guide what is done. Once completed, the information gathered during the data
audit can be used to help restructure current management information systems. It
can also assist you in locating additional points of contact with key constituencies 
(students and faculty) that might be better used to collect pertinent data. Above 
all, results of a data audit provide the basic ingredients needed to create the 
database (or databases) required to conduct ongoing in-depth analyses of the 
effectiveness of the first college year. 



Who Should Be Involved in Planning and Carrying Out 
the Data Audit? 

A data audit can be conducted by individuals or groups, but usually proceeds 
under the guidance of an institutional or unit-wide committee. Participants typically 
consist of institutional researchers, academic planners, student affairs professionals, 
student advisors, faculty and administrators. In addition, it is usually wise to have 
different perspectives represented on any team that either conducts or oversees a 
data audit. Involving individuals who are directly familiar with particular data 
sources because they use them every day — like people from the registrar’s office 
or institutional research — is always beneficial. It is also useful to involve some 
people who are entirely removed from data processes — for example, some student 
advisors, faculty or administrators. Such individuals will often benefit the data 
audit by bringing fresh perspectives to bear on the process, and they will benefit 
directly from knowing how particular kinds of data are kept at the institution. At 
the same time, they will acquire greater sensitivity to the fact that the information 
demands that they often make can be technically challenging or, under current con- 
ditions, impossible to meet. Also, those individuals who gather and maintain data 
will see that their information is important and will be used by others; therefore, 
they may make more of an effort to keep their data well maintained. 

The person or persons chosen to lead the committee should have broad support 
on-campus, particularly from upper-level administrators, and have a clear under- 
standing of the purpose of the data audit and analyses for the first year of college. 
Often institutions have co-chairpersons — one with strengths in either academic or 
student services and the other with strengths in technical areas.

When Should a Data Audit Be Done?

Data audits are usually done on the occasion of some other major activity. 
These occasions can include (but are not limited to) accreditation self-studies,
consideration of new transactional systems (notably student systems, but also
personnel systems), construction of a data warehouse or data mart, or the hiring
of new assessment personnel, institutional researchers, or first-year coordinators.
Although many institutions find it useful to conduct a data audit as a part
of or in support of these larger activities, it is not necessary to do so. A data audit 
can be done just because it seems like a good time to find out what data exist on 
campus and where they are located. Having said that, a data audit does not need 
to be conducted every year. It often works out that a three- or five-year cycle is 
sufficient. Among pilot institutions, universities preferred a 5-year cycle for data 
audits, and community colleges, because circumstances change more frequently 
there, preferred a 3-year cycle. Subsequent data audits can use results from the
first data audit as a foundation. 

The Right Attitude 

A fresh perspective and an open attitude are important when people conduct 
a data audit at an institution with which they may be well acquainted. One 
advantage of having internal personnel carry out the audit is that they will already 
know many of the vagaries of existing systems. There can be disadvantages to 
using “insiders,” though, including blindness to the existence of unofficial data, 
unwillingness to listen to other viewpoints, and an inability to probe deeply and 
consistently to determine whether data are defined differently in different places in 
the institution. Those involved in data audits should therefore constantly monitor 
their own assumptions and viewpoints to avoid these pitfalls. 

Similarly, conducting a data audit typically uncovers a range of attitudes on the 
part of those who collect and keep institutional data. Some will be eager to show 
what they have and will be happy to work with you to determine how a wider 
range of people on campus could better use the data for which they are responsible.
Others will be highly protective of the data for which they are responsible, and
may view audit questions as a threat to their functions and independence. An audit 
team needs to be aware that there are sometimes good reasons for this attitude. 
For example, free access to some data (e.g., health records or financial aid infor- 
mation) may violate privacy guidelines, and keepers of these kinds of data can get 
into trouble (and even be prosecuted) if they allow unlimited access. Others may 
fear that people unfamiliar with how data elements are collected, defined, and 
constructed will misuse the data. Still others may simply be protecting their 
autonomy, or covering up poor performance. In all such cases, be sure to listen 
carefully to their concerns, understand what really lies behind them, and make 
appropriate compromises. 



A Note About Confidentiality 

Student data are confidential. The Family Educational Rights and Privacy Act
(FERPA), also known as the Buckley Amendment, protects individually identifiable 
data from public scrutiny. In the course of a data audit, no individually identifiable 
data need to be or should be shared with others. The focus of a data audit is on 
the overall databases and their data elements, not on any specific individual data 
kept in those databases. If you are unclear about how your institution and state apply
FERPA, consult with the institutional researcher on your campus, who will be
well versed in what is and is not allowed. No part of this data audit will require you to
engage in any violations of FERPA. 



The Supply Side 

Official and Unofficial Databases 

Keep in mind that there are often two basic kinds of databases at any institution: 
official and unofficial. Usually “official” data (those required for federal reporting or
for official institutional reporting to the state) are centrally maintained, while “unofficial”
data are maintained by decentralized units. Many units gather data to
address their own internal needs and to meet unique or special reporting
requirements. At larger institutions such “guerrilla databases” are often kept in unit-level
computer systems rather than in official mainframe database files. Prominent 
examples include advising data, assessment data, placement data, and responses to 
student questionnaires — either institution-wide or specific to a unit or program. Or, 
in the case of qualitative data, student writing samples may be kept in electronic 
portfolios or as hard copies kept in filing cabinets. It is, therefore, important to 
look especially hard for these unofficial data sources when conducting a data 
audit, in order to make sure that key data elements are not overlooked. To uncover 
such sources, visit departments and units in person to ask about what data are kept 
and reported to external constituencies. 

Types of Data 

Data are gathered by multiple units and for multiple purposes throughout an 
institution. An illustration of possible student services units and offices that might 
have data relevant to the first year of college is provided in Figure 2. Figure 3 
lists the types of data about first-year students and their experiences that are 
typically kept by the principal student services offices and units listed previously 
in Figure 2. Note that there is some duplication and overlap in this listing because it 
is typical for different offices at an institution to collect the same kinds of information 
independently. For instance, the Testing Office, as well as the English department, 
may keep English placement data; the Counseling office, as well as Admissions, 
may keep information on parents’ education. Where this is the case, it is important 
to determine if they do so consistently and to then document any differences. Since 
the first year is influenced by both student services and academic affairs, a similar 
listing of pertinent academic affairs offices and the types of data they might collect 
can be found in Figure 4. 

Note also that these lists are far from exhaustive. Not all of these data may be 
gathered at your institution, your institution may gather additional data, or the data 
listed may be collected by offices different from those listed in Figures 3 and 4. 
But with these caveats, Figures 3 and 4 can be used as protocols for looking for
particular kinds of data when conducting a data audit. 






Transactional Data 

A data audit will also allow you to uncover and capture transaction-based data 
that are regularly collected by a unit to monitor its own operations. These so-called
“footprint” data are gathered from students as they move through and utilize a variety
of units on campus. Examples include data on bookstore and food service usage or 
data on student contacts with and utilization of counseling, advising, or tutorial 
offices. Cataloging this kind of footprint data makes it available for wider use and 
analysis and may eliminate the need to collect information about utilization via 
surveys or other special sources. Transaction-based data also have the advantage 
of being more complete than survey data because they are usually available for 
the entire student population affected, reducing the kinds of sampling or response- 
rate problems associated with using special-purpose questionnaires. The main 
disadvantage of footprint data is that they may not be about the topics that really 
interest you. Furthermore, they are often kept in intractable or inaccessible formats 
and places. You will discover the degree to which this is true at your institution 
while conducting the data audit. Even if such data are not eventually tapped for 
analysis, it is important to know that they exist and whether they are being kept 
consistently with one another and with official institutional definitions.

A data audit of the first year needs to be limited in scope. It must focus on what 
occurs during the first year of college. Although some people may be interested in
looking in more depth at preadmission information, such as data on first contacts,
that is not necessary unless an institution wants to include enrollment
management data in its analyses.

Who Collects Data and Why? 

The next point to determine is which units keep which data. Units and offices 
scattered across the institution often keep similar data. More often than not, these data
are unrelated to one another, cannot be linked, and may be based on slightly different
definitions. Similarly, units may analyze data in different ways to achieve different 
ends. During the data audit each of these points needs to be documented. If you 
find that multiple units keep virtually the same data but collect it independently,
the institution might want to consider establishing a centralized method for
addressing common needs more efficiently. By doing so, consistent definitions 
can be used across campus and the burdens of duplicative data collection can be 
minimized. 

Actually conducting the “supply side” of the data audit involves physically 
visiting each office or location that collects or maintains data, using Figures 3 and 
4 as guides. Directors of offices and units should be apprised of the data audit and 
why it is being conducted, but often it is associate directors, data analysts, or 
researchers in an office who know the details about data. It might often seem
easier to send out a survey or an email inquiry with these questions, but we advocate
face-to-face interviews in individuals’ offices (“walk-throughs”) for the following
reasons:

• It creates a collaborative atmosphere for the sharing of data and data sources. 

• It honors office personnel and the importance of their efforts. 

• It indicates an interest in office personnel and what they are doing. 

• It builds a relationship with these individuals and with the office. 

• It allows you to read reactions from individuals and see the office set up. 

• It allows you to do immediate follow-up and collect artifacts. 

• It also allows for “serendipitous” meetings and discussions about data and 
databases, including guerrilla databases.

When you visit each administrative office, academic department, or unit, it is 
important to determine: 

• What kinds of records, data, and databases it keeps on first year students, 
programs, experiences, and activities. 

• How data collected are used by the unit. 

• What schedules govern when data are collected, and if and when data are 
entered into computer systems. Extracts from live databases are often taken 
on the tenth day of a term and a given time period (such as one week) after 
the end of the term.

• What surveys it administers, to whom (all first-year students or a particular
subset of first-year students), and on what schedule. 

• What additional local data collection efforts it engages in with regard to first 
year students, courses, programs, experiences, and activities. 

• What questions the unit would like to be able to answer about the first year
of college, and what data would be needed that are not now collected.

• What gaps unit staff perceive in the data and information that
they possess on the first year of college.

• The extent to which available first-year data sources and databases are 
underutilized, and whether unit personnel have ideas about why this might 
be the case. 

Furthermore, while conducting the audit you need to ascertain across units: 

• Whether the records, data, and database structures that these units and offices 
maintain differ from one another, and exactly how they differ. 

• The extent to which definitions for common data elements vary across units 
and departments. 

• The extent to which formats in which common data elements are kept vary 
across units and departments. 
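
The cross-unit comparison described above can be sketched as a small inventory check. The example below is a minimal illustration, assuming the audit team records one row per unit and data element. The row layout, element names, definitions, and formats are hypothetical; only the unit pairings (Testing Office and English department, Counseling and Admissions) come from the examples in this section.

```python
from collections import defaultdict

# Hypothetical inventory rows from unit visits:
# (unit, data element, definition as used by the unit, storage format).
inventory = [
    ("Admissions", "parents_education", "highest level completed by either parent", "code 1-5"),
    ("Counseling", "parents_education", "whether either parent attended college", "Y/N"),
    ("Testing Office", "english_placement", "score on placement essay", "0-6"),
    ("English Department", "english_placement", "score on placement essay", "0-6"),
]


def find_inconsistencies(rows):
    """Flag data elements kept by more than one unit and note whether
    their definitions or formats differ across those units."""
    by_element = defaultdict(list)
    for unit, element, definition, fmt in rows:
        by_element[element].append((unit, definition, fmt))

    report = {}
    for element, entries in by_element.items():
        if len(entries) < 2:
            continue  # kept by a single unit, so nothing to compare
        report[element] = {
            "units": [unit for unit, _, _ in entries],
            "definitions_differ": len({d for _, d, _ in entries}) > 1,
            "formats_differ": len({f for _, _, f in entries}) > 1,
        }
    return report


report = find_inconsistencies(inventory)
for element, findings in report.items():
    print(element, findings)
```

Elements flagged as differing are the ones whose definitions and formats should be documented side by side in the audit report.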

While conducting the audit, it is often helpful to collect documentation about 
the data that each unit controls. Artifacts or documents to consider collecting from 
units when you visit them include: 

• The actual forms or questionnaires used to gather and record data. 

• Data element dictionaries. 

• Data element definitions (if not included in the data element dictionary). 

• Database structures and file formats used to archive data (for example, are 
data kept on CD-ROMs, in mainframe files, in Access databases, in old 80- 
character length fields). 

• Notes on specialized software needed, if any, to access and use data.

Notes about the actual coverage, timing, and completeness of the data should be
organized by type of data, following the logic of Figures 3 and 4. Forms, artifacts, 
and documents should be numbered and keyed to the text of the data audit report. 

How Complete Are the Data? 

The completeness of the data gathered by the institution and its individual units 
is critical. Data on a given topic are sometimes collected for only a portion of the 
entering student body — from those who attend orientation, who came to class on 
a particular day, or whose admissions files came through the regular admissions 
process, for example. It is, therefore, important to follow up with units about com- 
pleteness by asking them the following types of questions: 

• Are individual students required to fill out and answer all of the data elements 
on every form — either paper or online — such as admissions, registration, and 
housing forms? Or, when applicants fill out admissions forms are they told 
to fill out only certain information on the sheet? 

• Are data elements transferred from paper or online forms into databases? 
Who does this? Do data entry clerks do it? Are the forms scanned automatically?
Does the system load online entries directly as data elements into a
database? What is the schedule for accomplishing these entries? If data entry
is done by hand or if forms are scanned, are critical data elements entered
immediately and other, less critical data elements entered later in the term when 
there is less pressure? What happens to forms after data are entered? Do any 
checkpoints in the system exist to ensure that all data entry is completed? 

• Are there fields that are never entered into the database at all, even though 
the information is supplied or forms filled out? 

• Are individuals asked for detailed information on a form, but then upon 
data entry is the relevant data element collapsed into a “Yes/No” or other 
summary format? 

• Are data elements entered directly into the live student information database 
or are they entered into an intermediate database (e.g., Access or Excel) and 
then loaded? What office does this? 

Some institutions also find it helpful to run frequency checks — a summary of 
what values actually populate the fields and how often they each occur — of 
individual data elements that are not often used in order to determine directly the 
extent to which all students have entries and how error-free these entries are. 
Using a simple example, a frequency check on the field listing “gender (or sex)” 
might contain Ms and Fs in addition to 1s and 2s. A frequency check would also 
give an indication of how many persons in the file had no record of their gender. 
Sometimes, for example, computing center personnel will say that they “maintain” 
a given data element but later probing will reveal the fact that nobody loads data 
into the field anymore, or that the database fields contain unusable data. 
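
As a minimal sketch of such a frequency check, the Python fragment below counts 
how often each value occurs in one field and how many records have no entry at 
all. The record layout, the field name `gender`, and the `<missing>` label are 
illustrative assumptions, not features of any particular student information 
system. 

```python
from collections import Counter

MISSING = "<missing>"  # assumed label for blank or absent entries


def frequency_check(records, field):
    """Count how often each value appears in `field` across all records."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        # Treat None, empty strings, and whitespace-only entries as missing.
        if value is None or str(value).strip() == "":
            counts[MISSING] += 1
        else:
            counts[str(value).strip()] += 1
    return counts


# Hypothetical records showing mixed coding for the same data element.
students = [
    {"id": 1, "gender": "M"},
    {"id": 2, "gender": "F"},
    {"id": 3, "gender": "1"},
    {"id": 4, "gender": "2"},
    {"id": 5, "gender": ""},
    {"id": 6},
    {"id": 7, "gender": "F"},
]

gender_counts = frequency_check(students, "gender")
for value, count in gender_counts.most_common():
    print(f"{value}: {count}")
```

In this made-up file, the output would show letter codes and number codes side 
by side, plus two records with no entry, which is exactly the kind of finding 
to raise with the unit that maintains the field. 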

Where Do the Data Go? 

Once it has been determined which data elements each particular unit gathers, 
the next step is to determine where data elements go after they are collected. For 
each data element (or group of data elements), ask personnel in pertinent units: 

• Which databases do these data elements go into? Are certain data elements 
put into multiple databases? 

• How are these entries in other databases updated? On what schedule, and who 
is responsible? Are old values overwritten in this process? 

• Are fields used for multiple purposes? Are different offices using supposedly 
“unused” fields for different purposes and including their own data elements 
and codes? 

• What definitions are used for various data elements used in multiple databases? 

• Who has authority over these databases? 

Results of this portion of the data audit are often best documented in terms of 
a map or flow chart showing clearly how and when particular data elements move 
from point of collection to the various places where they are archived or used. 
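
For those who keep audit notes electronically, one possible way to record such 
a map is sketched below in Python. Each location lists the places a data 
element is sent next, and a short routine traces everywhere the element ends 
up. All of the location names are hypothetical examples and would be replaced 
by the forms, offices, and databases found on your own campus. 

```python
from collections import deque

# Hypothetical flow map: each location lists where the data element goes next.
flows = {
    "admissions form": ["admissions database"],
    "admissions database": ["student information system"],
    "student information system": ["data warehouse", "advising office spreadsheet"],
    "data warehouse": [],
    "advising office spreadsheet": [],
}


def trace(flow_map, start):
    """Return every location reachable from `start`, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for destination in flow_map.get(current, []):
            if destination not in seen:
                seen.append(destination)
                queue.append(destination)
    return seen


reachable = trace(flows, "admissions form")
for location in reachable:
    print(location)
```

A listing like this does not replace a drawn flow chart, but it can make it 
easier to spot locations that receive a data element without anyone being 
named as responsible for updating it. 
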

“Walking the Process” 

In order to accomplish these various steps, it is frequently useful to physically 
“walk the process” of collecting data. One way to do this is to adopt the student’s 
(or faculty member’s) perspective and go through each step that has to be accom- 
plished in order to complete a particular action — to register for a class, or to obtain 
financial aid, for instance. Determine the specific forms that students have to 
complete for which units across the campus in order to attain their objective. 
Follow up on each data element (or group of data elements) using the questions 
listed above. 

One pilot institution cleverly combined this aspect of the data audit with its 
ongoing institutional commitment to service. Staff members selected actual students 
to visit specific offices and “walk the process,” both to collect data and 
information for the data audit and to report on how well they were treated 
and on the customer service skills of the personnel in the various offices 
visited. 

Another way to “walk the process” is from a data element’s point of view. This 
will allow you to determine which units gather particular data elements (and 
identify any redundancies), which database(s) particular data elements are kept in, 
what definitions and categories are used in which databases, who is responsible 
for each data element, and who is using that data element to what end. 



Supply Side Summary: What Is Important to Gather During this Process? 

When gathering information about data on campus, it is important to collect as 
much documentation as possible about data and databases that exist. The following 
types of support documentation will be especially useful: 

• Copies of forms, both paper and online. 

• Data element dictionaries for databases. 

• Data element definitions (if not included in the data element dictionary). 

• Documentation on the structure of databases and the format(s) in which 
individual data elements are kept. 

• Information on historical database files and how many years of data are 
available. 

• Security guidelines and change procedures for all of the databases encoun- 
tered. That is, who has access to the data and who has authority for updating 
or changing the database or its data elements? 

The Demand Side 

The other aspect of doing a data audit is to determine what data needs exist 
on your campus. This aspect is best conceived of as the “demand” side of the 
analysis, complementing the “supply” side represented by the inventory of existing 
data sources. In addition to talking with individuals about the data they collect, 
you will need to talk with institutional decisionmakers and other data users. Often 
there is considerable overlap among individuals and offices that are data users and 
data collectors, but do not assume that there is. 

Offices, units, and individuals on campus that need to be contacted about data 
needs include academic affairs personnel — the provost, deans, and department 
heads — student affairs personnel, as well as individuals involved in accreditation 
studies or who must report information to state or federal officials (directors of 
TRIO programs or teacher education programs, for example). External reporting 
may be required, for example, to accreditors, to state agencies, or to 
governing boards. When you visit these offices and units, ask them: 

• Who are the office’s key internal and external constituencies? 

• What kinds of decisions does the office regularly make, and what kinds of 
information are needed (or desired) to make them? 

• To whom must the office report data and information? 

• What existing reports and data reporting requirements does the office have, and 
is it able to fulfill them? Working backward from existing reports 
and procedures, determine what data are needed and how calculations are 
made. 

• What kinds of reporting and decision cycles are typical? (For example, grant 
budget cycles can run on academic years, July-June fiscal years, or even 
October-September fiscal years, which will affect when data are needed.) 

• How current and accurate do data and information need to be? 

• What is missing that office personnel deem essential to have (that is, data 
they need versus data they want)? Are data missing because they do not exist, 
or is existing information not accessible to office personnel? 

• What questions should office personnel be able to answer about the first year 
of college? 

• What are their perceptions of gaps in data and information? 

Make sure to point out that even though you are asking them these questions, 
it does not mean the data audit will result in complete resolution of issues that 
are raised or that all their data desires will be met. Instead you should explain 
carefully that the intent is to inventory information resources and needs to help 
decisionmakers at the institution decide how to proceed. 



Demand Side Summary: What Is Important to Gather During this Process? 

Just as on the “supply side” of the data audit, it is useful to gather as much 
documentation as possible when you visit each site. Documents that you should 
gather from these offices and units include: 

• Copies of recent reports that the unit has submitted using official (and 
unofficial) institutional and office-level data. 

• Copies of data-reporting requirements, including schedules and format 
specifications. 

• Samples of the formats or methods the unit uses to analyze data (e.g., 
calculational routines used to compute class loads, advising schedules). 

Bottom Line: Summary of Procedures for Carrying Out a Data Audit 

Procedures for carrying out a data audit are summarized below. Please note, 
however, that while all these steps should be accomplished, it is important to 
be flexible in carrying out this task. Different institutions may require somewhat 
different approaches because of their organizational structures and politics. At the 
same time, some office or individual in the past (usually the Office of Institutional 
Research or its equivalent) may have previously accomplished much of the work 
included in a data audit. Where this is the case, it is useful to refer to this previously 
accomplished work as a starting point. Keep in mind that circumstances may have 
changed, there may be new office personnel, or something may have been overlooked 
in the process. 

1. Identify offices and units across campus that gather or keep data pertinent 
to the first year of college, as well as those offices and units that use or 
report data. Emphasize that the data audit is a collaborative institutional 
process. 

2. Contact appropriate individuals who can fairly represent the resources and 
perspectives of these offices and units. 

3. Set up mutually agreeable times to visit these individuals in their offices in 
order to discuss data sources and data uses. 

4. Approximately one week prior to visiting, send these individuals a list of 
the questions to be discussed and the artifacts or documents you will want 
to collect from them. If a particular office is only a data-source office or 
only a data-use unit, adjust the list of questions accordingly. 

5. Conduct the site visit. Ask your questions. Clarify, clarify, clarify. Take 
detailed notes. Collect artifacts and documents. Where appropriate, “walk 
the process” by simulating the steps a student (or faculty/staff member) 
would take, or follow the path of a particular data element from point of 
collection through data entry, archiving, and use. 

6. Before leaving, thank the people involved for their time and help. Invite 
them to contact you if they think of anything further that might be of use. 
Secure an agreement that should there be any follow-up questions, they 
would be willing to respond to them. Confirm their telephone numbers or 
email addresses. 

7. Send thank-you notes to people you visited and interviewed; it might be 
appropriate to copy their managers or bosses as well. 

In order to facilitate a culture of data use and information sharing on campus, 
consider making the findings of the first-year data audit available to the campus 
in the form of a brief report. 



OUTPUTS OF A DATA AUDIT 



After the “raw data” generated by the data audit have been assembled (interview 
notes or tapes or transcripts as well as artifacts documenting existing data and 
reports) from offices representing both data sources and data users, results should 
be synthesized to yield a coherent picture of data resources and the culture of data 
use at your campus. Many different ways of summarizing results are possible, 
depending upon institutional needs. In some cases, you may want to prepare a single 
comprehensive report on findings. In other cases, it may be more useful to organize 
findings around common topics — for example, lists of first year data resources and 
who has them, recommendations for a “common core” of data, and a report on the 
current culture of data use on campus. As noted earlier, it is also usually appropriate 
to prepare a brief summary of the project and its results for wider distribution to 
the campus community. 

Outcomes of the pilot study fit into four categories. Institutions found that they 
learned lessons about a) their first-year programs, b) broad data issues, c) how to 
improve data audit implementation, as well as d) refinements that could better 
their institutional infrastructures. Examples of lessons learned for first-year 
programs included: 

• Some institutions found that they did not have program goals for their 
first-year programs. 

• Some were under the impression that there was more tracking of first-year 
students happening on campus than was actually occurring. 

• They found that the data audit raised awareness about the entering-student 
program. 

• Student engagement with services, particularly student support, was not 
captured on some campuses. 

• A few campuses discovered that student course evaluations were not linked 
or kept in a database. 

Examples of lessons learned about data and data use on campus included: 

• Some institutions found that they do not enter all data, “resulting in loss of 
potentially valuable data.” 

• The issue was raised of who will decide which data are entered when budgets 
are tight and personnel are already overly busy. In practice, some data may 
not be entered; but as processes are increasingly automated, institutions 
should consider entering more data as it becomes feasible to do so. 

• Initially, at one institution, staff wanted to eliminate data but by the end of 
the audit many wanted to gather more data. 

• First generation college attendee information was often collected for only a 
particular population of students. The same was true for email addresses. 

• At one institution, pilot project administrators discovered that the value of 
data collection was challenged in student services areas because of the difficulty 
in seeing the connection between data collection and improved services. 

• Data are often not coordinated, shared, or organized well. 

• One institution is now going to put its fact book on the web. 

• Both university and community college personnel were very cooperative in 
supplying needed data. 

• Problems were uncovered with not storing or archiving data, having no 
historical data, incurring large amounts of data loss, and finding that needed 
data were being overwritten or purged. 

• At some institutions no data element dictionary existed. 

• When a college named a data element one way and another name was used 
for external reporting purposes, both names needed to be included in the data 
element dictionary to alert others to the dual name. 

• At one institution, the data audit confirmed what they already knew about 
their data and institutional data processes for the first year of college. 

• A few pilot schools encountered some resistance from the gatekeepers of the 
data. 

• One institution found that data were available in the data warehouse, but 
too little training was available on how to extract useful data, creating an 
accessibility issue. 

About conducting a data audit, pilot institution personnel found that: 

• Sending out the questions and the request for artifacts ahead of time 
meant that units had them available when the interviewers arrived. 

• Use of worksheets aided them in the collection of information for the data 
audit. 

• In some cases, multiple interviews were necessary with different people in 
offices because no single person knew what was possible. 

• Results of the data audit will be used to prioritize future data needs. 

Generally, pilot institutions found that the data audit: 

• Uncovered questions in addition to answers. 

• Identified redundancies that could be eliminated or opportunities to be 
studied during the next round of strategic planning. 

• Will result in an institutional report back on outcomes of the data audit to the 
institutional management team, the President's Council, etc. 

• Helped to create an institutional mindset around a total university 
approach — assessing our effectiveness by finding and using available data. 

• Scared some people because of the use of the word “audit.” 

• Made some departments relieved that they were not being singled out for 
review — that this was part of a larger, institution- or unit-wide project. 

• Led to increased understanding among committee members regarding what 
different departments do and how they fit into the overall functioning of the 
institution. 

• Gave people an understanding that these were issues other institutions were 
working on. 

• Was instrumental in highlighting the need for evidence in the form of data. 

• Was a way to involve faculty in data use. 

• Identified the need for an institutional Data Definition Committee. 

• Produced results that were, in the words of one pilot institution administrator, 
“strikingly consistent. Most people expressed a frustration with the difficulties 
encountered in trying to get data and most people wanted access to the same data 
and were trying to create the same types of reports — all independently with 
absolutely no efficiencies of scale.” 

Whatever the format for reporting ultimately selected, the following outputs of 
the data audit should be fully described: 

• Data element lists and specifications, including whether each element is kept 
in text or numeric format, where it comes from, when it is entered or 
how frequently it is updated, how consistent the coding is, how 
different units have interpreted definitions (for example, does a “0” mean 
zero or missing?), etc. 

• File structures and extract schedules including when “snapshots” of live 
transactional databases are taken. 

• Most common uses of data on campus. 

• Common reporting formats/templates. 

• Security issues. 

• The need to establish a Data and/or Security Committee. 

• Locus of responsibility for maintenance and control for different kinds of 
data and databases. 

• Recommendations on the various forms of user training needed to facilitate 
use of data resources. 

• Recommendations for methods and approaches for collecting needed data 
that are not currently collected by the institution. 
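
One hedged way to picture a data element specification is sketched below in 
Python. The record structure, the element name `transfer_credits`, and the two 
units' coding rules are all hypothetical. The point of the sketch is only to 
show how the question “does a ‘0’ mean zero or missing?” changes the counts 
that two offices would report from the same values. 

```python
from dataclasses import dataclass, field


@dataclass
class DataElementSpec:
    """Hypothetical specification for one data element as one unit records it."""
    name: str
    storage_format: str        # "text" or "numeric"
    source: str                # form or office where the element originates
    update_frequency: str
    missing_codes: set = field(default_factory=set)  # values that mean "no data"


def summarize(values, spec):
    """Count valid and missing entries according to the spec's missing codes."""
    missing = sum(1 for v in values if v is None or v in spec.missing_codes)
    return {"valid": len(values) - missing, "missing": missing}


# Unit A treats 0 as "not recorded"; Unit B treats 0 as a genuine zero.
spec_unit_a = DataElementSpec(
    name="transfer_credits", storage_format="numeric",
    source="admissions form", update_frequency="once, at entry",
    missing_codes={0},
)
spec_unit_b = DataElementSpec(
    name="transfer_credits", storage_format="numeric",
    source="admissions form", update_frequency="once, at entry",
)

values = [0, 12, 0, 3, None]
summary_a = summarize(values, spec_unit_a)
summary_b = summarize(values, spec_unit_b)
print("Unit A:", summary_a)
print("Unit B:", summary_b)
```

Under these assumptions, the same five entries yield two valid values for one 
unit and four for the other, which is why the coding convention belongs in the 
written specification. 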

In preparing to summarize the outputs of a data audit, it is helpful to be aware 
of and review some frequently encountered findings of such an exercise at other 
campuses. Among them are: 

1. The need to reposition student databases to examine behaviors from the 
student rather than from the institutional point of view. It is not unusual for 
institutions to collect a lot of data on students and student behavior, but not 
to use this information to investigate questions like, “How did students act 
out the first year curriculum in terms of course-taking?” or “How many 
first-year students visiting academic skills centers did so more than once each 
term?” One reason for this situation is that existing databases focus on the 
needs of record-keepers, not information users. Therefore they are typically 
hard to access, hard to use, and organized cross-sectionally rather than 
longitudinally. 

2. Opportunities to collect data more systematically using processes already in 
place and existing points of contact with students. There is always a tendency 
to invent brand-new data collection efforts every time a new information 
need is identified, which leads to students being repeatedly surveyed. In 
addition, administering surveys using different methodologies across terms 
and years can alter outcomes and results, which can create false perceptions 
of change. The point here is to be more deliberate about taking advantage of 
contacts/opportunities that are already available. Examples of these include 
student orientation sessions at which additional surveys might be collected, 
placement testing, student evaluation of instruction, and face-to-face 
advisement sessions. Emerging technology also provides opportunities. For 
example, more and more libraries, bookstores, residence halls, and student 
service offices are using “Smart Card” or “Card Swipe” systems to record 
use and attendance, creating an automatically generated record of contact and 
intervention for each student that can be recovered and used more broadly. 
Or, for students who access offices or services online, web usage statistics 
are another form of data to be collected. In addition, being deliberate in 
gathering and using data will likely reduce duplication of effort on campus 
and wasted resources. 

3. Unclear or inconsistent definitions across units for similar data elements. 
This mismatch can occur in both directly extracted and locally constructed 
or calculated data elements. Every institution can benefit from having clear 
definitions for data elements and distributing documentation containing 
those definitions widely to everyone on campus. For example, offices may 
use different definitions of first-time students; some may use first-time, 
full-time undergraduates, others may use the entire population of first-time 
undergraduates, which would include both full-time and part-time students. 

4. Self-reinforcing “spirals” of misperception on the part of those responsible 
for collecting/archiving data and those who seek to use it. A frequent 
finding of a data audit, for example, is that user communities have given up 
trying to obtain some kinds of data because of the difficulty of getting it — 
resulting in a perception by data communities that there is “no demand” for 
these data by users. 
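
The definitional mismatch described in the third finding can be illustrated 
with a short Python sketch. The student records and the 12-credit-hour 
threshold for full-time status are assumptions made for illustration only; 
your institution's own definitions should be substituted. 

```python
# Assumed threshold for full-time status; institutions may define it differently.
FULL_TIME_MIN_CREDITS = 12

# Hypothetical entering-student records.
students = [
    {"id": 1, "first_time": True, "credit_hours": 15},
    {"id": 2, "first_time": True, "credit_hours": 6},
    {"id": 3, "first_time": False, "credit_hours": 12},
    {"id": 4, "first_time": True, "credit_hours": 12},
]


def is_first_time(student):
    """Definition used by one office: all first-time undergraduates."""
    return student["first_time"]


def is_first_time_full_time(student):
    """Definition used by another office: first-time, full-time undergraduates."""
    return student["first_time"] and student["credit_hours"] >= FULL_TIME_MIN_CREDITS


count_all_first_time = sum(1 for s in students if is_first_time(s))
count_first_time_full_time = sum(1 for s in students if is_first_time_full_time(s))

print("First-time (all):", count_all_first_time)
print("First-time, full-time:", count_first_time_full_time)
```

Both offices would describe their figure as the number of “first-time 
students,” yet the two counts differ, so any rate built on them would differ 
as well. 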

As you seek to summarize the results of the data audit on your campus, it is 
important to be sensitive to these common issues, and to be reassured that they are 
not unusual. Furthermore, by being open to suggestions, you may discover new 
ways in which data can benefit all parties involved. 

The final step is to close the feedback loop to create a true culture of data use 
by communicating results of the data audit of the first year of college widely and 
taking action based on results of the analyses. 



CONCLUSION 



Embarking on a data audit designed to support and improve the first year of 
college is a significant step for any campus. Hopefully, the data audit will lead in 
the direction of a more comprehensive and intentional approach to collecting and 
analyzing information about the first year of college. In undertaking it, we want 
to reemphasize some of the points made at the outset of this Toolkit. 

First, always remember that “truth” lies in the variations. Real people with real 
differences make up the first-year population at any college, and the same is true 
of all our faculty and staff. So avoid being misled by averages and other “central 
tendency” results that are meant to apply to all students and situations. Instead, 
disaggregate the data as far as you can to uncover the many differences in experience 
and situation that probably exist. 
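
A small Python sketch may help show why averages can mislead. The records 
below are invented, and the grouping by full-time and part-time enrollment is 
only one of many possible ways to disaggregate; the measure (whether a student 
returned for a second year) is likewise an assumed example. 

```python
from collections import defaultdict

# Hypothetical records: 1 = returned for a second year, 0 = did not return.
records = [
    {"enrollment": "full-time", "retained": 1},
    {"enrollment": "full-time", "retained": 1},
    {"enrollment": "full-time", "retained": 1},
    {"enrollment": "full-time", "retained": 0},
    {"enrollment": "part-time", "retained": 1},
    {"enrollment": "part-time", "retained": 0},
    {"enrollment": "part-time", "retained": 0},
    {"enrollment": "part-time", "retained": 0},
]


def retention_rate(rows):
    """Share of students in `rows` who were retained."""
    return sum(row["retained"] for row in rows) / len(rows)


overall_rate = retention_rate(records)

# Disaggregate: collect the records for each enrollment group separately.
groups = defaultdict(list)
for row in records:
    groups[row["enrollment"]].append(row)

rates_by_group = {name: retention_rate(rows) for name, rows in groups.items()}

print(f"Overall: {overall_rate:.0%}")
for name, group_rate in rates_by_group.items():
    print(f"{name}: {group_rate:.0%}")
```

In this invented example the overall rate sits halfway between two very 
different group experiences and describes neither of them well. 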

Second, results of assessments and evaluations are almost always more useful in 
generating further questions and in stimulating reflective faculty/staff conversations 
than in “making judgments” about program performance. It will always be 
important to use available data to create occasions for further reflection and 
conversation about collective action, rather than employing data to point fingers 
and blame units or individuals for shortfalls in performance. Indeed, the metaphor 
of scholarship is usually effective in such situations: the object of evaluation is 
nothing more than to turn the tools and habits of systematic investigation that we 
were all trained to practice in our disciplines onto our own core enterprise of 
facilitating student success. Like scholarship in any field, the process of gathering 
and analyzing data about the first year of college should be open, deliberative, 
systematic, and ongoing — never really completed. 

Third, consistent with the view that engaging in assessment and evaluation is a 
profoundly educative act, students should be involved in the process as fully as 
possible. The best data systems are designed not only to provide evidence to 
decisionmakers but also to enable feedback and intervention in individual cases. 
Indeed, the data audit process may uncover numerous opportunities to communi- 
cate information back to students about their own strengths and weaknesses, or to 
introduce such information into the advisement relationship. At the program level, 
moreover, student participation in the process of interpreting evaluation results is 
often especially valuable. For example, focus groups of students are frequently 
useful in helping to interpret observed patterns of student behavior or to provide 
in-depth commentary on survey results. 

Fourth and finally, the mindset required for sustaining such projects in the long 
term is one of continuous improvement. Those engaged in assessing and evaluating 
first-year-of-college programs should always bear in mind that no matter how 
good things are (or you think they are), they can always be improved. Finding the 
ways in which this can be accomplished is about details, not about “silver bullet” 
solutions that try to change everything at once. Real improvements take place by 
identifying and addressing individual classes of problems occurring for particular 
types of students across the institution. The mindset that such improvement is a collective 
responsibility in pursuit of a common goal — student success in the first year of 
college — is critical to this process, as is a common store of usable information. 
Hopefully, this Toolkit will be of help in creating or strengthening this resource. 



GLOSSARY 



Anonymity (provision for): “Evaluator action to ensure that the identity of 
subjects cannot be ascertained during the course of the study, in study reports, or 
in any other way (Joint Committee on Standards for Educational Evaluation, 
1994).” “Only when the sponsor cannot identify each person’s response, even 
momentarily, is it appropriate to promise that a response is anonymous (Dillman, 
2000, p. 163).” 

Confidentiality: “Answers are confidential. This statement conveys an ethical 
commitment not to release results in a way that any individual’s responses can be 
identified as their own (Dillman, 2000, p. 163).” 

Data: “Material gathered during the course of an evaluation that serves as the 
basis for information, discussion, and inference (Joint Committee on Standards 
for Educational Evaluation, 1994).” 

Data Audit: The process of identifying data resources and uses wherever they 
may be within an institution and gathering them into a useable information system. 

Data Element: Single, individual piece of data such as “name” or “race.” 

Face Validity: “The extent to which an instrument looks as if it measures what 
it is intended to measure (Nunnally, 1970).” “An instrument has face validity if 
decisionmakers and information users can look at the items and understand what 
is being measured (Patton, 1984).” “It is obvious, on the face of it, that the proposed 
procedure is the best way of measuring the phenomenon of interest (Rutman, 
1984).” “Apparent validity, typically of test items or of tests; there can be skilled 
and unskilled judgments of face validity. Highly skilled judgments come pretty 
close to content validity, which does require systematic substantiation (Scriven, 
1991).” 

Footprint Data: Data that are gathered from a student or faculty member in the 
normal course of interacting with a postsecondary institution — e.g., data gathered 
on an admissions form, or on a form to have access to library resources. 

Goal: “A statement, usually general and abstract, of a desired state toward 
which a program is directed (Rossi and Freeman, 1993).” “An end that one strives 
to achieve (Joint Committee on Standards for Educational Evaluation, 1994).” 

Guerrilla Database: An unofficial database not normally known to the larger 
institution — e.g., database of student teacher experiences and mentors for 
Education students. 

Information: “Numerical and nonnumerical findings, renderings, or presenta- 
tions — including facts, narratives, graphs, pictures, maps, displays, statistics, and 
oral reports — that help illuminate issues, answer questions, and increase knowledge 
and understanding of a program or other object (Joint Committee on Standards for 
Educational Evaluation, 1994).” 

Needs Assessment: “Systematic appraisal of the type, depth, and scope of a 
problem (Rossi and Freeman, 1993).” “...is a process for discovering facts about 
the functions or dysfunctions of organisms or systems; it’s not an opinion survey 
or a wishing trip (Scriven, 1991).” 

Objectives: “Specific, operationalized statements detailing the desired 
accomplishments of a program (Rossi and Freeman, 1993).” “Something aimed at or 
striven for, more specific than a goal (Joint Committee on Standards for 
Educational Evaluation, 1994).” 

Official Data: Data reported to federal or state agencies that must be exactly 
replicable. 

Policy Significance: “The significance of an evaluation’s findings for policy 
and program development (as opposed to their statistical significance) (Rossi and 
Freeman, 1993).” 

Sensitivity Analysis: The systematic analysis of the influence of various input 
values on the output of a model. 

Snapshots: To freeze data from a transactional database by capturing it at one 
particular time. 

Stakeholders: “Individuals or groups who may affect or be affected by program 
evaluation (Joint Committee on Standards for Educational Evaluation, 1994).” 

Transactional Database: A live database used to conduct interactions between 
humans and electronic databases, e.g., a registration system. 

Triangulation: “The use of multiple sources and methods to gather similar 
information (Joint Committee on Standards for Educational Evaluation, 1994).” 

Unit of Analysis: “The least divisible element on which measures are taken 
and analyzed (Joint Committee on Standards for Educational Evaluation, 1994).” 

Unofficial Data: Data that may not necessarily be replicable. 

Utility: “The extent to which an evaluation produces and disseminates reports 
that inform relevant audiences and have beneficial impact on their work (Joint 
Committee on Standards for Educational Evaluation, 1994).” 

FIGURE 1 
Expectation Exercise 

From Regional State University 
RESULTS FROM THE ACADEMIC LEADERSHIP RETREAT 2001 

National Survey of Student Engagement Question: 

In your experience at your institution during the current school year, 
about how often have you done each of the following? 

                                         Freshmen                     Senior
                                 Predicted  Ideal  Actual    Predicted  Ideal  Actual

a. Asked question in class
   or contributed to class
   discussions                     1.96     3.36    2.69       2.81     3.72    3.32

b. Made a class presentation       1.62     2.68    2.20       2.77     3.46    2.93

c. Prepared two or more
   drafts of a paper or
   assignment before
   turning it in                   1.53     3.24    2.94       2.27     3.42    2.61

d. Worked on a paper or
   project that required
   integrating ideas or
   information from various
   sources                         1.95     3.28    3.22       2.74     3.61    3.32

“Predicted” were predicted by a faculty group prior to seeing actual results. 
“Ideal” were projected by a faculty group prior to seeing actual results. 
“Actual” are actual student results from that institution for 2001. 

FIGURE 2 

Student Services for Online Learners Beyond the Administrative Core 



The purpose of using this “web” in the Data Audit and Analysis Toolkit is to 
illustrate the variety and breadth of student services on a typical college 
campus, and the interactions among them. 

This figure is used by permission from the Western Cooperative for Educational 
Telecommunications Learning Anytime Anyplace Partnership project. The goal of 
that project is to design student services beyond the administrative core. To reach 
a common understanding about what was meant by student services for purposes 
of the project, the partners divided services needed by online learners into five 
clusters or suites: administrative core services, academic services, communications 
services, personal services, and student communities services. 
Student Affairs Offices and the Types of Data They Might Keep

FIGURE 3 (cont.)
Academic Affairs Units and the Types of Data They Might Keep

FOREWORD



The Data Audit and Analysis Toolkit is intended to help those responsible 
for planning and implementing programs focused on the first college year 
to better understand the student experience during this critical period. The 
idea for the Toolkit grew out of our strong conviction that colleges and 
universities in the country typically “don’t know what they know” about 
the first year of college. Most institutions have a lot of data about first- 
year students. But these data are frequently collected by different offices 
for different purposes and are not usually harnessed by faculty and staff 
to paint a comprehensive picture of what is happening to first-year students. 
The Toolkit provides a way to begin exploiting these hidden information 
resources to enhance both experiences and outcomes for students in their 
first college year. 

While the notion of a “toolkit” may at first seem mundane, we view 
this effort in the light of a larger vision provided by Russell Edgerton, 
Director of the Pew Forum on Undergraduate Learning. When Russ was 
leading the education grant-making program at The Pew Charitable Trusts, 
he inspired and funded a remarkable array of improvement initiatives 
for undergraduate education. Some of these, like John Gardner’s work in 
the Policy Center on the First Year of College, were intended to directly 
improve institutional practices. Working in Russ’ words “from the inside 
out,” they were designed to change the way colleges and universities 
do business by applying the best of what we know about what helps 
students learn and succeed. Others, like Peter Ewell’s work at the 
National Center for Higher Education Management Systems (NCHEMS) 
on accreditation and public accountability, were intended to shape the 
broader conditions within which higher education institutions do their 
work. Operating “from the outside in,” they were designed to change 
public conversations about “quality” in higher education, and to create 
and align external incentives for institutions to act deliberately to improve 
undergraduate education. Running through both was the common theme 
of taking active, collective responsibility for student learning and success. 
The Toolkit is but one of many initiatives advanced in this spirit by the 
Policy Center on the First Year of College — which is itself one of some 
forty individual projects that are now members of the Pew Forum on 
Undergraduate Learning. Though the language of the Toolkit is of data
elements and analysis, a common vision of success and improvement 
inspired its creation and should remain foremost in our minds. 

The specific idea for the Toolkit came up in a speech Peter delivered 
at John’s invitation to the National Forum on Assessment of the First 
College Year, held at the University of South Carolina in February 2000. 
Peter’s central theme in this talk was that college officials usually have 
only limited understanding of the “lived experience” of first-year college 
students — the often highly personal events and milestones that may make 
the difference between leaving an institution and sticking it out. With 
better understanding, educators could establish better policies, build 
better programs, and make better decisions. A second key point Peter 
made was how different and complex these “lived experiences” turn out 
to be. Behind the “averages” of most statistics are myriad real individuals — 
who come to college with different expectations and abilities, and who 
interact with the institution in distinctive ways. The same program may 
thus have very different effects on different kinds of students, and we 
establish “generic” programs at our peril. 

Understanding the diverse experiences of students in their first college 
year demands better information than most institutions can currently 
lay their hands on. A good first step is to identify, inventory, and round 
up the data that your institution already has about first-year students. 
Capitalizing on NCHEMS’ experience in conducting “data audits” of this 
kind, we enlisted the help of ten pilot institutions to help us try out the 
concepts embodied in the Toolkit. Karen Paulson of NCHEMS took the 
lead in drafting the document and incorporating the lessons learned from 
the pilot institutions. Mike Siegel of the Policy Center did yeoman service in 
recruiting pilot schools and in coordinating the review and implementation 
process. Based on the experiences of these pilot participants, institutions 
can benefit significantly from taking stock of their existing information 
resources on the first year of college. Any strategy for improvement, 
though, should utilize multiple measures in addition to the student-record 
information that the data audit will reveal. Prominent candidates for such 
additional measures are two data-collection approaches also underwritten
by Pew: the National Survey of Student Engagement (NSSE) and the joint
Policy Center and UCLA Higher Education Research Institute’s survey,
Your First College Year. But whatever the approach taken, institutions
should be as proactive and creative as they can be in seeking multiple 
sources of information about how students experience and negotiate their 
critical initial encounter with college. 

The information that results from this exercise has many uses. Most 
important, of course, better understanding can lead to program improvement.
Specific knowledge of what works, for whom, and under what circumstances
can help those responsible for first-year programs to design better
interventions and experiences, tailored particularly to the needs and
characteristics of different kinds of students. The same kind of information can
help educators evaluate the effectiveness of these interventions and, if
they are proven effective, can help them argue for continued funding in 
those tight budget years that seem to be all too common these days. 
Building the databases needed to understand the first year of college also 
positions institutions to gradually extend the coverage of their information 
resources to address the entire undergraduate experience. Concentrating 
initially on information to improve first-year success can thus address a 
prominent problem faced by many colleges and universities while it 
simultaneously provides the foundation for a more comprehensive campus 
assessment effort. 

But most important of all as you begin to use this Toolkit is to remember 
the original vision: increasing the success and academic performance of 
the diverse array of students who attend our many institutions. They and the 
public depend on us to provide the effective academic programs and support 
services that can help them fulfill their rich and unique potentials. 



Peter Ewell and John Gardner 
INTRODUCTION



Why a Data Audit? 

The basic objective of a data audit is to identify and inventory data sources
and needs across the campus. Information derived from the audit can then be
used to design and create a flexible analytical database suited to conducting
a range of analyses about the first year of college on an on-demand basis.
Put simply: A data audit allows an institution to periodically and
systematically take stock of, and then mobilize, its data resources. All colleges
and universities should consider conducting a data audit with regard to 
the first year of college in order to accurately assess the implementation 
and impact of the first year on students, faculty, and staff. If an institution 
chooses, data audits can be expanded to include the entire institution and 
data about students at all levels. 

A fundamental shift of perspective is required to assess the implementation
and impact of the first year. Determining “what happened” and “what
mattered” during that year involves moving from a cross-sectional to a
longitudinal perspective. Data contained in live transactional databases
such as admissions or registration systems, by their very nature, change
every day. Therefore, using such data directly to examine students and
their behavior analytically has many drawbacks. Instead, we need to capture
“snapshots”: that is, to freeze carefully defined subsets of these data at
periodic intervals and archive them for later analysis.
These subsets of data can be used in combination to provide a model of 
student movement through the curriculum and institution. Determining 
which particular data elements to capture in this manner — and where they 
can be found — is a primary objective of the data audit. Often data are found 
organized in databases by type of data or survey or survey administration. 
What we really want for analysis, though, are data organized by student,
analogous to a transcript that assembles data about what happens to each
student over time. Data of this kind enable us to investigate the first-year student
experience to examine such items as patterns of retention and interrupted 
enrollment, the order in which courses are taken and completed (or dropped), 
and any association between academic success and participating in particular 
kinds of programs or interventions. 
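
The two ideas in the paragraph above, freezing periodic snapshots of a live system and then reorganizing them by student, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `live_registration`, its field names, the snapshot labels, and the helper functions are all hypothetical, and a real audit would draw on the campus’s own systems and data definitions.

```python
from collections import defaultdict

# Hypothetical live registration table; in practice it changes every day.
live_registration = [
    {"student_id": "S001", "course": "ENG101", "status": "enrolled"},
    {"student_id": "S001", "course": "MAT090", "status": "enrolled"},
    {"student_id": "S002", "course": "ENG101", "status": "enrolled"},
]

archive = {}  # snapshot label -> frozen copy of the selected data elements


def take_snapshot(label, rows, fields):
    """Copy only the audited fields so later changes to `rows` cannot alter the archive."""
    archive[label] = [{f: row[f] for f in fields} for row in rows]


def by_student(snapshots):
    """Reorganize archived snapshots into one transcript-like history per student."""
    history = defaultdict(list)
    for label in sorted(snapshots):
        for row in snapshots[label]:
            history[row["student_id"]].append((label, row["course"], row["status"]))
    return dict(history)


fields = ["student_id", "course", "status"]
take_snapshot("2001-fall-census", live_registration, fields)

# The live system keeps changing: S001 drops MAT090 after the census date.
live_registration[1]["status"] = "dropped"
take_snapshot("2001-fall-end", live_registration, fields)

records = by_student(archive)
for step in records["S001"]:
    print(step)
```

Because each snapshot is a copy, the census record for S001 still shows MAT090 as enrolled even after the live table records the drop. That preserved difference is what makes a longitudinal analysis possible.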



Conducting a data audit and creating a database, however, are not ends
in themselves but activities in an ongoing process, designed to enable 
campuses to more effectively understand and improve the experiences of 
their students in the first year of college. Examining patterns of student 
behavior and the effectiveness of first-year programs, therefore, is as 
much a matter of attitude as it is of technique. A key point here is simply the 
commitment to improve. Institutional commitment, supplemented with the 
flexibility and latitude to make changes in first-year programs and activities, 
will make a difference to students. Individuals involved in first-year-of-college
programs should be continually encouraged to ask empirical questions
about performance and effectiveness, and to back up their opinions
and anecdotes with facts. It is appropriate to ask: “Is this an empirical
question that can actually be answered and supported with some data?”

This Toolkit is based on the premise that it is important to conduct a 
data audit and data analyses on the entire first year of college. This requires 
bringing together data already gathered and used, as well as data that are 
collected and unused, to get a holistic understanding of the first year of 
college, rather than focusing on separate activities, experiences, and classes. 
The Pew Charitable Trusts and The Atlantic Philanthropies generously 
supported the Policy Center on the First Year of College and the National 
Center for Higher Education Management Systems as they developed the 
documents and conducted the pilot study for the Toolkit. A call for
participation in the pilot study yielded nineteen applications. From these, staff
chose ten institutions to represent a range of institutions: Augustana College
(IL), The University of Minnesota-Duluth, Ohio University, Northeastern
State Technical and Community College (TN), The University of Texas-El
Paso, University of Cincinnati, Lynchburg College (VA), Blue Ridge
Community College (VA), Santa Fe Community College (FL), and 
Washington State University. Input from this diverse set of institutions has 
strengthened the Toolkit and made it more applicable in a variety of settings. 

The Technical Manual of the First Year Data Audit Toolkit is designed 
for use by both technical personnel who will be conducting the data 
audit and associated analyses and the administrators who want more 
in-depth information about data audits. The Technical Manual begins
with the same chapters and sections found in the companion document,
The Administrative Rationale of the First Year Data Audit Toolkit. First,
a general overview explains the importance of a data audit focused on
the first year of college. The next section briefly outlines how to foster a
culture of evidence on campus and offers some tips for creating a “data-based
dialogue” with various campus constituencies, followed by an outline of
what is involved in conducting a data audit. The Technical Manual then
continues with a set of recommendations for a “common core” of data
elements that institutions should consider assembling and maintaining 
in order to conduct analyses of the first year of college. This section is 
followed by a short discussion about the construction of longitudinal 
student databases. Finally, the Technical Manual concludes with the kinds 
of data analyses that might be used to illustrate what is happening in the first
year of college. A range of standard reporting templates are provided as
an associated appendix. The companion, Administrative Rationale,
contains only the beginning sections and is targeted for academic affairs or
administrative affairs administrators in order to build an argument for 
conducting a data audit on campus. 



CREATING A CULTURE OF DATA USE



Conducting a data audit and creating a database for analysis are not 
ends in themselves but activities in an ongoing process designed to enable 
campuses to more effectively understand and improve the experiences of 
their students in the first year of college. Examining patterns of student 
behavior and the effectiveness of first-year programs, therefore, is as much 
a matter of attitude as it is of technique. A key point here is simply the 
desire to improve — and the flexibility and latitude to make the kinds of 
changes in programs and activities that will make a difference. Campus 
leaders need to: 

• Foster this attitude continually, 

• Allow people “in the trenches” the discretion to change what they do, 

• Encourage active and ongoing participation of the faculty, staff, and 
students from across the institution, 

• Encourage as much use of public records and open access as is possible, 
given confidentiality guidelines, 

• Support institutional faculty, staff, and administrators with data and 
appropriate resources, and 

• Visibly celebrate their efforts and successes. 

A second key point is to remember that the “right” things to do during 
the first year of college are not just matters of opinion and debate, but can 
be investigated concretely with real data. As a result, all people involved 
in first-year-of-college programs should be continually encouraged to ask 
empirical questions about performance and effectiveness, and to back up 
their opinions and anecdotes with facts. In other words, whenever some- 
body is tempted to assert, “X is happening” or that “Y is the case,” (s)he 
should always pause to consider, “Is this an empirical question that can 
actually be answered and supported with some data?” 

When seeking to build a culture of data use on campus, it is also important 
to bear in mind the many different ways in which people use information. 
Most researchers or institutional analysts tend to adopt the rational
perspective on data use, which assumes that those individuals running programs
want information to make decisions. And, indeed, that is often the case. 
Real decisions must be made in first-year programs about such matters 
as whether to continue with particular program components, how much 
to invest in various activities, and how to establish priorities for serving 
specific types of students. It is equally important, though, to be aware that 
information serves a variety of other functions in any organizational setting. 
Among the most prominent of these are: 

• Problem Identification. Sometimes data are useful simply to signal
the fact that a problem exists that needs to be further investigated. In
this regard, establishing statistical indicators may well be a profitable
course of action. Monitoring indicators over time (e.g., annually, from
term to term, etc.) can reveal the extent to which progress is being
made in improving performance, or it can chart important changes in
student behaviors or conditions. Graphic or visual displays of such
information are often useful for problem identification because they
can quickly be scanned for anomalies. For the first year of college,
for example, useful statistical indicators might include:

✓ First-to-second-term persistence (reenrollment) rate,

✓ Fall-to-fall reenrollment rate,

✓ Percent of first-year students in academic difficulty,

✓ Percent of first-year students requiring and completing
developmental work in basic skills areas (reading, writing, math),

✓ Number of violations of established policies for placing students
into courses or prerequisite course sequences,

✓ Student/faculty and student/advisor ratios, or

✓ Percentage of courses dropped.

• Context Setting. Another prominent use of the kinds of information
generated through a first-year data audit is simply to paint a broad
picture of what is happening in a particular setting or for a targeted
population. In contrast to problem identification, in which very
specific pieces of data are used as indicators that point to an
underlying condition or phenomenon, the objective here is to flesh out a
situation as completely as possible using as much information as 
possible. An example for the first year might include an in-depth 
look at the experiences of male students of color, drawing on data 
about basic patterns of persistence and coursetaking, questionnaire 
data on attitudes and perceptions, and data about participation in and 
reactions to first-year programming. Presentation of such results 
usually emphasizes how the various individual pieces of data fit 
together to yield a comprehensive and integrated “story” of what is 
happening. Consistent with this emphasis, qualitative data drawn 
from observations and interviews are often used in conjunction with
statistics — both in order to expand the portrait of experience being 
created and to render the presentation more “real.” 

• Informing Discussion. Because academic settings are highly
participatory, decisions are often long in coming and discussions of opinions
and options are frequently long and arduous. Concrete data are useful
in such settings to focus discussion and to close off obviously
unproductive lines of thinking. At the outset, for example, a concrete piece
of data about a student experience or about the effectiveness of a
particular program element can generate a far more focused and useful
discussion of what might be done than can a vague feeling that
“something is wrong.” At least as important, using data judiciously
can also help guide a wandering discussion and can discipline it so
that uninformed opinions are less dominant. Committees are a fact
of life in the academy, and many first-year activities are governed or
advised by them. Using data to frame and steer committee
discussions in productive ways (away from mere anecdotal stories) can
thus be especially important.

• Selling Decisions. Decisionmaking is always complex, and
decisionmakers rarely make a decision only on the basis of formally supplied
information. Additional factors will always include political climate, 
perceptions of potential impact, and a good deal of plain “gut feeling.” 
Nevertheless, given this complexity, data are often useful in explaining 
a decided-upon course of action after the fact. This strategy helps 
mobilize support for the decision, and allows the decision to be easily 
explained to those not involved in making it but whose “buy-in” 
is nevertheless important. At the same time, information can be 
especially important in making a case to funders that a particular 
program or line of work is critical. While seemingly cynical, this use 
of information is nevertheless important in the real world of academic 
decisionmaking and those responsible for first-year-of-college programs 
ignore it at their peril. 
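
Two of the statistical indicators listed under Problem Identification above, the first-to-second-term persistence rate and the percentage of courses dropped, reduce to simple ratios once student-level records are assembled. The sketch below is a hedged illustration only: the four student records and their field names are invented for the example, and each campus will need its own definitions of who counts in the cohort and what counts as a drop.

```python
# Hypothetical student-level records for one entering cohort.
# Field names are illustrative, not a prescribed data element list.
students = [
    {"id": "S001", "enrolled_term2": True, "courses_attempted": 5, "courses_dropped": 1},
    {"id": "S002", "enrolled_term2": False, "courses_attempted": 4, "courses_dropped": 2},
    {"id": "S003", "enrolled_term2": True, "courses_attempted": 5, "courses_dropped": 0},
    {"id": "S004", "enrolled_term2": True, "courses_attempted": 6, "courses_dropped": 1},
]


def persistence_rate(cohort):
    """Share of the cohort that reenrolled for the second term."""
    return sum(1 for s in cohort if s["enrolled_term2"]) / len(cohort)


def percent_courses_dropped(cohort):
    """Dropped courses as a percentage of all courses attempted by the cohort."""
    attempted = sum(s["courses_attempted"] for s in cohort)
    dropped = sum(s["courses_dropped"] for s in cohort)
    return 100 * dropped / attempted


rate = persistence_rate(students)
dropped_pct = percent_courses_dropped(students)
print(f"First-to-second-term persistence: {rate:.0%}")
print(f"Courses dropped: {dropped_pct:.1f}%")
```

Computing the same indicators for each archived term, or separately for subgroups of students, turns them into the kind of trend display that can be scanned quickly for anomalies.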



Strategies



There are also a number of proven tactics for using information in 
productive ways on campus and for getting people involved in looking 
at data. Among the most useful are the following: 



• Expectation Exercises. One of the most frequently encountered
reactions when sharing a piece of information with a campus
audience is, “I already knew that.” Partly this response occurs because
individuals want to feel that they grasp situations fully, even though
they may not have thought much about them in advance. Partly it is
because the human mind is good at thinking up explanations for
things after the fact, and thus at not being “surprised” by them. But
this reaction often gets in the way of acting on information in
real-world situations. One way to counter it is, before the results are
revealed, to ask those involved what they think the result of any 
analysis or data-gathering exercise is going to be. (For example, if 
you ask faculty at many regional state institutions what the mix of 
degrees granted in a year might be, their answers are often heavily
weighted toward the liberal arts.) This exercise makes participants
think concretely about consequences and possible actions from the
outset. More importantly, it provides a baseline against which the
actual results can be compared, once they are distributed.
(Continuing the example, the reality of the degree mix for regional state
institutions is usually heavily weighted toward business and education 
degrees — professional, rather than liberal arts degrees.) Differences 
between the forecast and the truth often provide a springboard for 
discussions about action implications because people are surprised 
and more likely to then be drawn into the discussion. See Figure 1 
for an example. 

• Discrepancy Studies. Along the same lines, data are often most
powerful in generating interest or in starting discussions when they are
packaged around a discrepancy. Discrepancies can be of many kinds,
for instance, between:

✓ Expectations and actuality (as above),

✓ Established targets and actual performance,

✓ Aspirations and reality,

✓ Standing policies and real behavior, or

✓ One population group and another.

By their very nature, discrepancies tend to command more attention
than a number presented on its own. A particularly powerful way to
start discussions about advising, for instance, is to present data on
student course-taking behavior that suggest established prerequisite
policies are being violated and that students are failing subsequent
courses as a result.

• Beginning with a Recognized Problem. Most people are not
interested in data for their own sake. As a result, it is often a challenge to
build support for a campus-wide project whose sole objective appears 
to be to improve data resources. Instead, it is usually better to begin such 
efforts with a presenting problem that is apparent to everybody — 
for example, widespread academic failure among first-generation 
students, visible shortfalls in quantitative reasoning skills among 
entering students, or uneven teaching quality in multi-section courses. 
Obviously, such presenting problems will be different on each campus 
and cannot be predicted. Indeed, the “demand” side of the data audit 
process is often useful precisely because it unearths such examples. 
Once identified, much of the effort can then be packaged around the
need to address such concrete, widely recognized problems rather
than around just a vague need for better data.

• Creating Public Opportunities for Discussing Data. For similar reasons,
many campuses have found it valuable to create highly participatory
occasions to discuss the implications of data findings. Such
discussions can involve broad cross-sections of the campus community or
be limited to those directly involved in running programs and are often 
conducted during non-peak scheduling periods in retreat settings. 
One public university, for example, holds a summer planning retreat 
each year with broad participation from faculty and program staff. At 
the retreat, a few key data findings are presented and participants 
break up into small working groups to brainstorm ideas about what 
might be done in response. Results of these sessions are then shared 
and discussed, and become action priorities for the coming year. 
Many variations on this theme are possible, but all involve
presenting selected statistics, then gathering a group of people (including
students) to discuss their implications. 

• Avoiding Data Overload. Many analysts err in the direction of trying
to report too much when they present findings — either in report form 
or in public occasions such as those noted above. Analyses should be 
comprehensive and thorough but it is usually better to release a few 
carefully chosen findings, organized around issues or problems that 
are important, rather than present a “data dump.” Answering the 
inevitable questions that a limited set of findings will generate and 
thus initiating a “data dialogue” is the best way to get people hooked 
on information. 
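
The prerequisite example given under Discrepancy Studies above can be made concrete with student course histories. The following Python sketch rests entirely on assumptions: the policy table `PREREQS`, the course codes, and the seven enrollment records are hypothetical, and “met the prerequisite” is defined here as having passed the required course in an earlier term.

```python
# Hypothetical prerequisite policy: course -> required earlier course.
PREREQS = {"MAT101": "MAT090"}

# Hypothetical course histories: (student_id, course, term_index, passed).
enrollments = [
    ("S001", "MAT090", 1, True),
    ("S001", "MAT101", 2, True),
    ("S002", "MAT101", 1, False),  # took MAT101 without ever taking MAT090
    ("S003", "MAT090", 1, False),
    ("S003", "MAT101", 2, False),  # attempted the prerequisite but did not pass it
    ("S004", "MAT090", 1, True),
    ("S004", "MAT101", 2, True),
]


def met_prerequisite(student, course, term):
    """True if the student passed the required course in an earlier term."""
    prereq = PREREQS.get(course)
    if prereq is None:
        return True
    return any(
        s == student and c == prereq and t < term and ok
        for s, c, t, ok in enrollments
    )


def failure_rates(course):
    """Return (failure rate for students who met the policy, rate for violators)."""
    failed = {True: [], False: []}
    for s, c, t, ok in enrollments:
        if c == course:
            failed[met_prerequisite(s, c, t)].append(not ok)

    def rate(flags):
        return sum(flags) / len(flags) if flags else None

    return rate(failed[True]), rate(failed[False])


compliant_rate, violator_rate = failure_rates("MAT101")
print("Failure rate, prerequisite met:", compliant_rate)
print("Failure rate, prerequisite not met:", violator_rate)
```

A gap between the two rates shows an association, not proof that the violation caused the failures, but it is exactly the kind of discrepancy that tends to start a conversation about advising.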

A final point about building cultures of evidence is that action and 
follow-through are the most important conditions of all. Few people are 
interested in investing in information if it is clear that nobody will act on it 
and that nothing will change. Conversely, one of the best ways to promote 
involvement is to actively demonstrate that change is intended and 
possible. As a result, it is frequently useful to undertake reasonably small 
projects at first, where follow-through can be demonstrated immediately 
to potentially doubting constituencies. 
THE FIRST YEAR OF COLLEGE



Why Are First Year Data Important? 

The first year of college is a confusing time for students, faculty, and 
college personnel. Whether at a community college or at a four-year
institution, multiple programs are often in place, offered to different types of
students, creating multiple experiences with many different types of
interactions. Cause and effect is always an issue. Determining which programs
and which interactions have beneficial effects for which groups of students
is often difficult. We need a lot of data, often from disparate
systems or offices, collected systematically, and organized appropriately 
in order to conduct such analyses. 

The first year is also a logical place to anchor the development of a 
wider institutional assessment effort. Data collected on the first year of 
college can be the foundation for expanded data use and analyses on the 
entire institutional experience as warranted. Though complex, the first 
year usually consists of a well-delineated set of experiences for an easily 
identified set of students. It is, therefore, a manageable place to start when 
building an evaluation capacity at any institution. Furthermore, it makes 
chronological sense to begin a larger longitudinal study of student experi- 
ence with the first year. Once baseline data about the characteristics and 
experiences of an entering cohort of students are assembled, it is possible 
to continue to capture information about these students throughout their 
academic careers. 

Finally, information about the effectiveness of first-year-of-college 
programs gives program directors an important resource to make the case 
for which programs to continue and target for possible expansion and which 
to discontinue. First-year-of-college programs often comprise politically 
fragile and specially-funded activities, so evaluating effectiveness is critical 
to proving their ultimate worth. Data must be presented in ways that
facilitate discussions about future investments. From a wider perspective, such
discussions may simultaneously help to develop a “culture of data use”
on campus for the long term that will aid not only first-year but other
activities as well. 



Institutional Questions About the First Year of College 
How should we analytically untangle the many elements of the first
year of college and dissect what makes it work? Underlying this master 
question are four more focused questions having to do with: 

A. What is planned for the first year of college? 

B. Who is involved in the first year of college? 

C. What happened (and where) during the first year of college? 

D. What mattered (and why) during the first year of college? 

A. What Is Planned for the First Year of College?

An initial question to be asked has to do with identifying the objectives 
of the first year of college at your institution. Even more basically, one 
might ask whether the first year of college is conceived as an integrated 
and intentional set of experiences that students are actively advised 
through and participate in. In initially establishing first-year-of-college 
programs, most institutions will have already answered this question in 
the affirmative. Given the existence of a “program,” though, are its objec- 
tives clearly defined? Like learning outcome statements for a curriculum, 
it is important to define the objectives of first-year activities specifically 
in terms of: 



• How individual students will be different, 

• When that difference is expected (initially or after participation is 
complete), and 

• What students will be required to do in the first year of college. 

Instead of being defined for individual students in this manner, 
program objectives are often framed more generically in terms of what 
the institution will do or what the institution wants to happen for the 
student body as a whole, or perhaps what an institution wants to happen 
for an identified group of students. Defining objectives for a student body 
as a whole rather than for individual students should be avoided because 
it is far less useful in providing guidance for assessment and evaluation. 
Who should be involved in designing learning objectives for the first 
college year? Stakeholders to be involved would probably include student 
affairs professionals, departments and faculty teaching first-year courses, 
and residence hall staff where appropriate. Once objectives for the first 
year of college are defined, then it is necessary to clarify their meaning
and implications with the groups responsible for the various activities. 

A primary objective of first-year-of-college programs is to ensure continued
student success. It is important to emphasize that proof of this
objective is always found after the fact. It is manifested in what happens
next for the student at the institution and within the curriculum, for example, 
persistence and ultimate graduation, actual levels of student performance 
in subsequent academic coursework, and the achievement of particular 
learning outcomes. 

Potential outcomes for students in their first year of college include, 
but are not limited to: 



• Developing foundational academic skills such as quantitative, writing, 
speaking, technology or information literacy skills. 

• Learning how to “negotiate” college and the collegiate culture. 

• Managing academic life and learning good practices, such as understanding
what constitutes scholarly work and the difference between primary and
secondary sources.

• Developing appropriate non-cognitive abilities and attitudes like 
motivation, self-worth, and respect for others. 

• Learning how to balance academic work with social life, and often, 
family responsibilities. 

• Developing approaches to critical thinking and problem solving 
appropriate to a variety of academic disciplines. 

Each of these possible outcomes suggests a particular evaluative line 
of inquiry and a specific set of data sources that might be tapped. In 
addition, the first year of college is often a testing ground for innovative 
practices that might be extended throughout the college experience if they 
prove effective. Examples of such practices include peer mentorship and 
collaboration, problem-based learning, and hands-on engagement with 
subject matter. Given their potential wider significance, it is always wise 
to evaluate the impact and effectiveness of such innovations in some 
detail.

B. Who Is Involved in the First Year of College?

It is important to identify the specific characteristics of the students and 
faculty who participate in the first year of college. While we may think 
we know our students well, we often harbor unexamined assumptions 
about their backgrounds, attitudes and capabilities. For example, we will 
probably want to know a good deal about the following: 

• Student demographic characteristics like gender, race and ethnicity, 
age, disability status, family background, and whether students’ parents 
attended college. 

• Previous educational experiences and achievements of first-year 
students.

• Student educational and career aspirations, attitudes toward attending
college, and areas about which first-year students are apprehensive 
or expect to encounter difficulties. 

• Characteristics of the faculty and staff who work with first-year 
students including demographics, professional background and 
experience, and what they expect of students. 

These factors can often interact with one another in complex ways to 
create specific populations of students and staff. For example, a statement
that at an institution the “average age of an incoming student is 25”
often masks the fact that there may be two distinct populations (one of
18-year-olds and the other composed of more mature students) who are
likely to behave very differently. While such issues might simply be a
problem of data presentation, they can also harden into an “institutional
myth” that could be dispelled by further data disaggregation.
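
The point about averages can be illustrated with a minimal sketch. The ages below are invented for illustration, and the cutoff of 22 used to separate the two groups is an assumption made for this example, not a definition used by this Toolkit.

```python
from statistics import mean

# Hypothetical ages of incoming students (invented for illustration).
ages = [18, 18, 18, 18, 18, 32, 32, 32, 32, 32]

# Assumed cutoff separating traditional-age from older students.
CUTOFF = 22

def disaggregate(values, cutoff):
    """Split ages into two groups and report the size and mean age of each.

    Assumes both groups are non-empty; mean() raises an error otherwise.
    """
    younger = [a for a in values if a < cutoff]
    older = [a for a in values if a >= cutoff]
    return {
        "younger": {"n": len(younger), "mean_age": mean(younger)},
        "older": {"n": len(older), "mean_age": mean(older)},
    }

overall = mean(ages)
groups = disaggregate(ages, CUTOFF)

print("Overall mean age:", overall)
for name, summary in groups.items():
    print(name, summary)
```

In this invented file the overall mean describes no actual student; only the disaggregated groups show who is enrolled.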

The key is to always remember that real students, faculty, and staff, 
who bring a broad cross-section of diverse experiences and perspectives 
with them to the institution, populate the first year of college. Unless we 
know a good deal about these experiences and perspectives, it will be 
hard to figure out what is going on. 

C. What Happened During the First Year of College, and Where Did
It Happen?

The question of what actually happened to students during their first 
year of college is rarely asked systematically. Instead, we tend to assume 
that all first-year programs were implemented as planned and that the 
experiences of all students were uniform. But this is frequently not 
the case. Some experiences are planned and explicit while others are 
spontaneous, amorphous, and random. An operational mantra that should 
therefore continually be kept in mind is, “Adopt the student’s point of 
view.” This essential change of lens from our perspective to the student’s 
perspective is critical to determining what really happened to whom. It 
requires not “looking at” students but instead “looking through” students’ 
eyes to determine the actual behaviors they engage in when they encounter 
and act out the programs we put in place, as well as what experiences they 
brought with them to the programs. Sometimes the only way to get the 
answers to such questions is to “walk the process” by putting yourself in 
the student’s shoes and duplicating and documenting each step directly. 
For example, one such analysis at a large university revealed that students 
were often missing the first ten minutes of several of their classes simply 
because they could not get across campus from their last class fast enough 
to show up on time. 

By adopting the student’s perspective, most people discover that what 
actually happens to students in the first year depends a lot on the successful 
implementation of programs and courses as planned. However, few activities 
or programs are actually implemented as planned. Programs often show
little impact when evaluated because they were never successfully implemented,
not because they were inherently ineffective. For instance, if a
part of a first-year program centered on a particular instructional strategy
(e.g., attending a ropes course or using a new software product) that was not
available until halfway through the term, that is an implementation
problem. As Joan Stark, professor at the University of Michigan, has 
pointed out, there are always significant differences between the design, 
the delivery, and the resulting student experiences associated with any 
curriculum (Stark and Lowther, 1986). Therefore, it is necessary to look 
for what interfered with full implementation or what situations arose that 
altered the original implementation plan. 

Three specific syndromes common to the implementation of any 
program, including those in the first college year, often contribute to this 
problem and should be anticipated: 

• Piecemeal development of programs and program elements that do 
not fit together very well. Often this approach results in duplication 
of efforts or gaps in service that are very apparent to students but not 
always obvious to faculty and administrators. 

• Rushing to implement any new design. This situation often introduces 
a good deal of unintended variation in the way programs are imple- 
mented across departments, units, or locations — resulting in uneven 
(or even contradictory) effects. 

• Adoption of a “true believer” stance that assumes automatically that 
certain things must be effective (e.g., small classes, collaboration in all 
circumstances, etc.). This attitude is often an admirable characteristic 
of programs about which people care deeply, but unexamined assump- 
tions about effectiveness may mask real difficulties in implementation 
or design. 

All three syndromes suggest devoting much more attention to questioning
our assumptions about first-year programs from the outset. Rather than
relying on what you think might have happened, it is always wise to
check these assumptions against real data.

D. What Mattered During the First Year of College, and Why Did It
Matter?

The question of impact, of course, is ultimately what we want to get to 
in any analysis. Hopefully first-year experiences result in identifiable and 
beneficial changes in behavior, attitudes, and cognitive abilities that are 
consistent with program goals. The analytical task associated with 
answering the question “What mattered?” is to look for longitudinal paths 
of student learning and development through the curriculum and 
extracurricular activities that are consistent with the individual student 
outcomes that you want to achieve. This task requires an essential shift of 
perspective from a “still photo/snapshot” view of college life to a “moving
picture” perspective that emphasizes development and attainment. Doing
so enables us to look for different patterns of student movement and flow 
through the college experience that are created by interactions among 
the formal curriculum, co-curricular activities, and students’ own extra- 
collegiate experiences. Taking this perspective introduces many behavioral 
questions that need to be addressed, such as: 

• In what order do students take particular classes and co-curricular 
activities, and how frequently do they participate in particular 
experiences? 

• Do students actually follow the advice given to them in advisement,
and what difference does it make?

• What kinds of experiences mattered most for what kinds of students 
in terms of cognitive or affective development? 
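
As a rough illustration of the first question, the order in which students take courses can be derived from term-by-term enrollment records. The record layout, student identifiers, and course codes below are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical enrollment records: (student_id, term_number, course).
records = [
    ("s1", 1, "ENG101"), ("s1", 2, "MATH110"),
    ("s2", 1, "MATH110"), ("s2", 2, "ENG101"),
    ("s3", 1, "ENG101"), ("s3", 2, "MATH110"),
]

def course_sequences(rows):
    """Return each student's courses as a tuple ordered by term."""
    by_student = defaultdict(list)
    for student_id, term, course in rows:
        by_student[student_id].append((term, course))
    return {
        sid: tuple(course for _, course in sorted(items))
        for sid, items in by_student.items()
    }

sequences = course_sequences(records)

# Count how many students followed each ordering.
pattern_counts = Counter(sequences.values())
for pattern, count in pattern_counts.most_common():
    print(" -> ".join(pattern), count)
```

Once the common orderings are known, each can be compared against later outcomes to see whether the sequence itself appears to matter.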

Ultimately, of course, the question of “what mattered” needs to be
addressed in terms of intended outcomes and program objectives, which
is why it is so important to be precise about these in the first place. The
first year of college may also have many unintended or unplanned conse- 
quences for students, both for the better and for the worse. As a result, it 
is always wise to build flexibility into databases and analyses with the 
expectation that the unexpected will happen. 

Given these demands for evidence to document the first college year — 
and the failure of most institutions to systematically determine who is 
involved, what is happening, and what mattered in this period — it pays to 
be systematic about assembling data resources. Techniques for doing so 
are the central concern of this Toolkit. Going beyond technique, the basic 
mindset of questioning assumptions and of constantly posing and 
addressing the four basic questions discussed in this section (what is
planned, who is involved, what happened, and what mattered) will always
be helpful.

What Is a Data Audit and Why Do It?

Data Audit: The process of identifying data resources and uses 
wherever they may be within an institution and gathering 
them into a useable information system. 

The basic objective of a data audit is to identify and inventory data sources 
and needs across the campus. Information derived from the audit can then 
be used to design and create a flexible analytical database suited to con- 
ducting a range of analyses about the first year of college on an on-demand 
basis. Such a database is most useful if it is separated from the regular 
student information system kept by the registrar. By their very nature, the 
data contained in live transactional databases — like admissions or regis- 
tration systems — change every day. Therefore, using such data directly to 
examine students and their behaviors analytically has many drawbacks. 
To move beyond a view of students based solely on glimpses at
the student information system, we need instead to capture, at periodic
intervals, “snapshots” that contain carefully defined subsets of student
data and to archive them for later analyses. These analyses will
often require using these “snapshots” in combination to create a “moving 
picture” that approximates student movement through the curriculum. 
Determining which particular pieces of data, or data elements, to capture
in this manner, and where they can be found, is a primary objective of
the data audit. A summary of data audit steps is included at the end of this
section.
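
One way to picture combining “snapshots” into a “moving picture” is sketched below. The snapshot names, status codes, and the "absent" label for students missing from a snapshot are all assumptions made for illustration.

```python
# Hypothetical archived snapshots. Each maps student_id to the
# enrollment status recorded when the snapshot was taken.
snapshots = {
    "fall_census": {"s1": "enrolled", "s2": "enrolled", "s3": "enrolled"},
    "fall_end": {"s1": "enrolled", "s2": "withdrawn", "s3": "enrolled"},
    "spring_census": {"s1": "enrolled", "s3": "enrolled"},
}

# Chronological order in which the snapshots were taken.
SNAPSHOT_ORDER = ["fall_census", "fall_end", "spring_census"]

def build_trajectories(snaps, order):
    """Combine snapshots into one status history per student."""
    student_ids = set()
    for name in order:
        student_ids.update(snaps[name])
    return {
        sid: [snaps[name].get(sid, "absent") for name in order]
        for sid in sorted(student_ids)
    }

trajectories = build_trajectories(snapshots, SNAPSHOT_ORDER)
for sid, history in trajectories.items():
    print(sid, " -> ".join(history))
```

Because each snapshot is archived unchanged, the same history can be rebuilt later even though the live system has since been overwritten.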

Put simply, a data audit allows an institution to take stock of and then 
mobilize its data resources. All colleges and universities should want to take 
this action with regard to the first year of college for the reasons presented 
in this document: a) “generic” programs are seldom useful for real (and 
therefore different) types of students; and b) factors that affect one sector 
of the student population may not affect another, resulting in differing 
implications for both policy and intervention strategies. The capability to 
analytically disaggregate the student population to determine what works 
for whom is therefore critical.

Elements of a Typical Data Audit

A data audit consists of two primary activities: 

• Examining existing data sources at the institution wherever these
may be found, and

• Determining those data that are most critical for evaluation, assessment,
and decisionmaking needs.

These two activities can be thought of as building campuswide under- 
standing, respectively, of the “supply” of data and the “demand” for data. 
Conducting a data audit thus involves identifying data sources, creating 
data inventories, and documenting data collection methods and routines 
already in place. Examining management and decisionmaking needs, in 
turn, requires determining schedules and formats for submitting data or 
information to external constituencies (e.g., accreditors or the state) and 
determining whether there are management needs for information that are 
not currently being fulfilled. 
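
The “supply” and “demand” comparison can be sketched as a pair of simple inventories. The unit names, reporting needs, and data element names below are invented; an actual audit would fill them in from office visits.

```python
# Hypothetical supply inventory: data elements each unit keeps.
supply = {
    "registrar": {"student_id", "credits_attempted", "term_gpa"},
    "admissions": {"student_id", "high_school_gpa", "parent_education"},
    "counseling": {"student_id", "parent_education"},
}

# Hypothetical demand inventory: data elements each need requires.
demand = {
    "accreditation_self_study": {"term_gpa", "credits_attempted"},
    "first_year_seminar_evaluation": {"term_gpa", "advising_contacts"},
}

def audit_gaps(supplied, demanded):
    """Compare what is kept with what is needed."""
    available = set().union(*supplied.values())
    needed = set().union(*demanded.values())
    return {
        "unmet_needs": sorted(needed - available),
        "unused_supply": sorted(available - needed),
    }

gaps = audit_gaps(supply, demand)
print("Needed but not collected:", gaps["unmet_needs"])
print("Collected but not requested:", gaps["unused_supply"])
```

Elements that are needed but not collected point to new data collection; elements that are collected but not requested may be underused resources.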

While there are many different ways to conduct a data audit, these two 
overarching purposes — to determine data sources and data needs — should 
always guide what is done. Once completed, the information gathered 
during the data audit can be used to help restructure current management 
information systems. It can also assist you in locating additional points 
of contact with key constituencies (students and faculty) that might be 
better used to collect pertinent data. Above all, results of a data audit 
provide the basic ingredients needed to create the database (or databases) 
required to conduct ongoing in-depth analyses of the effectiveness of the 
first college year. 



Who Should Be Involved in Planning and Carrying Out 
the Data Audit? 

A data audit can be conducted by individuals or groups, but usually 
proceeds under the guidance of an institutional or unit-wide committee. 
Participants typically consist of institutional researchers, academic planners, 
student affairs professionals, student advisors, faculty and administrators. 
In addition, it is usually wise to have different perspectives represented 
on any team that either conducts or oversees a data audit. Involving indi- 
viduals who are directly familiar with particular data sources because they 
use them every day — like people from the registrar’s office or institutional 
research — is always beneficial. It is also useful to involve some people 
who are entirely removed from data processes — for example, some 
student advisors, faculty or administrators. Such individuals will often 
benefit the data audit by bringing fresh perspectives to bear on the process, 
and they will benefit directly from knowing how particular kinds of data 
are kept at the institution. At the same time, they will acquire greater 
sensitivity to the fact that the information demands that they often make
can be technically challenging or, under current conditions, impossible to
meet. Also, those individuals who gather and maintain data will see that 
their information is important and will be used by others; therefore, they 
may make more of an effort to keep their data well maintained. 

The person or persons chosen to lead the committee should have broad 
support on campus, particularly from upper-level administrators, and
have a clear understanding of the purpose of the data audit and analyses 
for the first year of college. Often institutions have co-chairpersons — one 
with strengths in either academic or student services and the other with 
strengths in technical areas. 



When Should A Data Audit Be Done? 



Data audits are usually done on the occasion of some other major 
activity. These occasions can include (but are not limited to) accreditation
self-studies, consideration of new transactional systems (notably student
systems, but also personnel systems), the building of a data warehouse or
data mart, or the hiring of new assessment personnel, institutional
researchers, or first-year coordinators. Although many institutions find it useful
to conduct a data audit as a part of or in support of these larger activities,
it is not necessary to do so. A data audit can be done just because it seems
like a good time to find out what data exist on campus and where they are 
located. Having said that, a data audit does not need to be conducted 
every year. It often works out that a three- or five-year cycle is sufficient. 
Among pilot institutions, universities preferred a 5-year cycle for data audits, 
and community colleges, because circumstances change more frequently 
there, preferred a 3-year cycle. Subsequent data audits can use results 
from the first data audit as a foundation. 



The Right Attitude 

A fresh perspective and an open attitude are important when people 
conduct a data audit at an institution with which they may be well 
acquainted. One advantage of having internal personnel carry out the 
audit is that they will already know many of the vagaries of existing 
systems. There can be disadvantages to using “insiders,” though, 
including blindness to the existence of unofficial data, unwillingness to 
listen to other viewpoints, and an inability to probe deeply and consis- 
tently to determine whether data are defined differently in different places in 
the institution. Those involved in data audits should therefore constantly 
monitor their own assumptions and viewpoints to avoid these pitfalls. 

Similarly, conducting a data audit typically uncovers a range of 
attitudes on the part of those who collect and keep institutional data. 
Some will be eager to show what they have and will be happy to work 
with you to determine how a wider range of people on campus could
better use the data for which they are responsible. Others will be highly
protective of the data for which they are responsible, and may view audit 
questions as a threat to their functions and independence. An audit team 
needs to be aware that there are sometimes good reasons for this attitude. 
For example, free access to some data (e.g., health records or financial aid 
information) may violate privacy guidelines, and keepers of these kinds of 
data can get into trouble (and even be prosecuted) if they allow unlimited 
access. Others may fear that people unfamiliar with how data elements 
are collected, defined, and constructed will misuse the data. Still others may 
simply be protecting their autonomy, or covering up poor performance. In 
all such cases, be sure to listen carefully to their concerns, understand 
what really lies behind them, and make appropriate compromises. 

A Note About Confidentiality 

Student data are confidential. The Family Educational Rights and
Privacy Act (FERPA), also known as the Buckley Amendment, protects
individually identifiable data from public scrutiny. In the course of a data 
audit, no individually identifiable data need to be or should be shared with 
others. The focus of a data audit is on the overall databases and their data 
elements, not on any specific individual data kept in those databases. If 
you are unclear about how your institution and state apply FERPA, consult
with the institutional researcher on your campus. They will be well versed 
in what is allowed or not. No part of this data audit will require you to 
engage in any violations of FERPA. 

The Supply Side 

Official and Unofficial Databases 

Keep in mind that there are often two basic kinds of databases at any 
institution: official and unofficial. Usually “official” data (those required
for federal or official institutional reporting to the state) are
centrally maintained, while “unofficial” data are maintained by
decentralized units. Many units gather data to address their own internal
needs and to meet unique or special reporting requirements. At larger 
institutions such “guerrilla databases” are often kept in unit-level com- 
puter systems rather than in official mainframe database files. Prominent 
examples include advising data, assessment data, placement data, and 
responses to student questionnaires — either institution-wide or specific to 
a unit or program. Or, in the case of qualitative data, student writing sam- 
ples may be kept in electronic portfolios or as hard copies kept in filing 
cabinets. It is, therefore, important to look especially hard for these unof- 
ficial data sources when conducting a data audit, in order to make sure 
that key data elements are not overlooked. To uncover such sources, visit 
departments and units in person to ask about what data are kept and 
reported to external constituencies.

Types of Data

Data are gathered by multiple units and for multiple purposes through- 
out an institution. An illustration of possible student services units and 
offices that might have data relevant to the first year of college is 
provided in Figure 2. Figure 3 lists the types of data about first-year 
students and their experiences that are typically kept by the principal 
student services offices and units listed previously in Figure 2. Note that 
there is some duplication and overlap in this listing because it is typical 
for different offices at an institution to collect the same kinds of informa- 
tion independently. For instance, the Testing Office, as well as the English 
department, may keep English placement data; the Counseling office, as 
well as Admissions, may keep information on parents’ education. Where 
this is the case, it is important to determine if they do so consistently and 
to then document any differences. Since the first year is influenced by 
both student services and academic affairs, a similar listing of pertinent 
academic affairs offices and the types of data they might collect can be 
found in Figure 4. 

Note also that these lists are far from exhaustive. Not all of these data 
may be gathered at your institution, your institution may gather additional 
data, or the data listed may be collected by offices different from those 
listed in Figures 3 and 4. But with these caveats, Figures 3 and 4 can be
used as protocols for looking for particular kinds of data when conducting 
a data audit. 

Transactional Data 

A data audit will also allow you to uncover and capture transaction-based
data that are regularly collected by a unit to monitor its own operations.
These so-called “footprint” data are gathered from students as they move
through and utilize a variety of units on campus. Examples include data
on bookstore and food service usage or data on student contacts with and 
utilization of counseling, advising, or tutorial offices. Cataloging this 
kind of footprint data makes it available for wider use and analysis and 
may eliminate the need to collect information about utilization via surveys 
or other special sources. Transaction-based data also have the advantage of 
being more complete than survey data because they are usually available 
for the entire student population affected, reducing the kinds of sampling 
or response-rate problems associated with using special-purpose ques- 
tionnaires. The main disadvantage of footprint data is that they may not 
be about the topics that really interest you. Furthermore, they are often 
kept in intractable or inaccessible formats and places. You will discover the 
degree to which this is true at your institution while conducting the data 
audit. Even if such data are not eventually tapped for analysis, it is important
to know that they exist and whether they are being kept consistently
with one another and with official institutional definitions.

A data audit of the first year needs to be limited in scope. It must focus
on what occurs during the first year of college. Although some people
may be interested in looking in more depth at preadmission information,
such as data on first contacts, doing so is not necessary unless an
institution wants to include enrollment management data in its
analyses.



Who Collects Data and Why? 






The next point to determine is which units keep which data. Units and 
offices scattered across the institution often keep similar data. More often
than not, these data sets are unrelated to one another, cannot be linked,
and may be based on slightly different definitions. Similarly, units may analyze data
in different ways to achieve different ends. During the data audit each of 
these points needs to be documented. If you find that multiple units keep 
virtually the same data but collect it independently, the institution might 
want to consider establishing a centralized method for addressing 
common needs more efficiently. By doing so, consistent definitions can 
be used across campus and the burdens of duplicative data collection can 
be minimized. 
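
A minimal sketch of how such documentation might be checked for overlap follows. The unit names echo the examples given earlier in this section, but the element names and definitions are invented for illustration.

```python
from collections import defaultdict

# Hypothetical audit notes: (unit, data element, definition as described).
audit_notes = [
    ("Testing Office", "english_placement", "score on the campus placement essay"),
    ("English Department", "english_placement", "course level assigned by faculty reader"),
    ("Admissions", "parent_education", "highest degree earned by either parent"),
    ("Counseling", "parent_education", "highest degree earned by either parent"),
    ("Registrar", "credits_attempted", "credits enrolled on the census date"),
]

def find_shared_elements(notes):
    """List elements kept by more than one unit and whether definitions agree."""
    holders = defaultdict(dict)
    for unit, element, definition in notes:
        holders[element][unit] = definition
    report = {}
    for element, by_unit in holders.items():
        if len(by_unit) > 1:
            report[element] = {
                "units": sorted(by_unit),
                "definitions_agree": len(set(by_unit.values())) == 1,
            }
    return report

report = find_shared_elements(audit_notes)
for element, details in report.items():
    print(element, details)
```

Elements flagged with differing definitions are candidates for a common campuswide definition; those that agree may be candidates for a single shared collection.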



Actually conducting the “supply side” of the data audit involves
physically visiting each office or location that collects or maintains data, 
using Figures 3 and 4 as guides. Directors of offices and units should be 
apprised of the data audit and why it is being conducted, but often it is 
associate directors, data analysts, or researchers in an office who know 
the details about data. It might seem easier to send out
a survey or an email inquiry with these questions, but we advocate
face-to-face interviews in individuals’ offices (“walk throughs”) for the
following reasons:

• It creates a collaborative atmosphere for the sharing of data and data 
sources. 

• It honors office personnel and the importance of their efforts. 

• It indicates an interest in office personnel and what they are doing. 

• It builds a relationship with these individuals and with the office. 

• It allows you to read reactions from individuals and see the office setup.

• It allows you to do immediate follow-up and collect artifacts. 

• It also allows for “serendipitous” meetings and discussions about
data and databases, including guerrilla databases.



When you visit each administrative office, academic department, or 
unit, it is important to determine: 

• What kinds of records, data, and databases it keeps on first-year
students, programs, experiences, and activities.

• How data collected are used by the unit.

• What schedules govern when data are collected, and if and when data 
are entered into computer systems. Extracts from live databases are 
often taken on the tenth day of a term and a given time period (such 
as one week) after the end of the term. 

• What surveys it administers, to whom (all first-year students or a 
particular subset of first-year students), and on what schedule. 

• What additional local data collection efforts it engages in with
regard to first-year students, courses, programs, experiences, and
activities.

• What questions the unit would like to be able to answer about the
first year of college, and what data would be needed that are not now
collected.

• What gaps unit staff perceive in the data and information
that they possess on the first year of college.

• The extent to which available first-year data sources and databases 
are underutilized, and whether unit personnel have ideas about why 
this might be the case.

Furthermore, while conducting the audit you need to ascertain across
units: 

• Whether the records, data, and database structures that these units 
and offices maintain differ from one another, and exactly how they 
differ. 

• The extent to which definitions for common data elements vary 
across units and departments. 

• The extent to which formats in which common data elements are 
kept vary across units and departments. 



While conducting the audit, it is often helpful to collect documentation 
about the data that each unit controls. Artifacts or documents to consider 
collecting from units when you visit them include: 

• The actual forms or questionnaires used to gather and record data. 

• Data element dictionaries. 

• Data element definitions (if not included in the data element dictionary). 

• Database structures and file formats used to archive data (for example,
are data kept on CD-ROMs, in mainframe files, in Access databases, or
in old 80-character-length fields?).

• Notes on specialized software needed, if any, to access and use data.

Notes about the actual coverage, timing, and completeness of the data
should be organized by type of data, following the logic of Figures 3 and 4. 
Forms, artifacts, and documents should be numbered and keyed to the 
text of the data audit report. 



How Complete Are the Data?

The completeness of the data gathered by the institution and its individual
units is critical. Data on a given topic are sometimes collected for only a 
portion of the entering student body — from those who attend orientation, 
who came to class on a particular day, or whose admissions files came 
through the regular admissions process, for example. It is, therefore, 
important to follow up with units about completeness by asking them the 
following types of questions: 

• Are individual students required to fill out and answer all of the data 
elements on every form — either paper or online — such as admissions, 
registration, and housing forms? Or, when applicants fill out admis- 
sions forms are they told to fill out only certain information on the 
sheet? 



• Are data elements transferred from paper or online forms into data- 
bases? Who does this? Do data entry clerks do it? Are the forms 
scanned automatically? Does the system load online entries directly 
as data elements into a database? What is the schedule for accom- 
plishing these entries? If data entry is done by hand or if forms are 
scanned, are critical data elements entered immediately and other, 
less critical data elements entered later in the term when there is less 
pressure? What happens to forms after data are entered? Do any check- 
points in the system exist to ensure that all data entry is completed? 

• Are there fields that are never entered into the database at all, even 
though the information is supplied or forms filled out? 

• Are individuals asked for detailed information on a form, but then 
upon data entry is the relevant data element collapsed into a “Yes/No” 
or other summary format? 

• Are data elements entered directly into the live student information
database or are they entered into an intermediate database (e.g., Access
or Excel) and then loaded? What office does this?

Some institutions also find it helpful to run frequency checks — a 
summary of what numbers actually populate the fields and how often they 
each occur — of individual data elements that are not often used in order 
to determine directly the extent to which all students have entries and how 
error free these entries are. Using a simple example, a frequency check on 
the field listing “gender (or sex)” might contain Ms and Fs in addition to 
1s and 2s. A frequency check would also give an indication of how many 
persons in the file had no record of their gender. Sometimes, for example, 
computing center personnel will say that they “maintain” a given data 
element, but later probing will reveal that nobody loads data into 
the field any more, or that the database fields contain unusable data. 
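
A frequency check of the kind just described can be sketched in a few lines of Python. This is only an illustration, assuming the field has already been extracted as a list of records; the function name `frequency_check`, the record layout, and the sample values are hypothetical and not drawn from any particular student information system.

```python
from collections import Counter

def frequency_check(records, field):
    """Count how often each value occurs in one field.

    Blank or absent entries are tallied under '(missing)' so that
    incomplete coverage shows up alongside inconsistent coding.
    """
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            counts["(missing)"] += 1
        else:
            counts[str(value).strip()] += 1
    return counts

# Hypothetical extract: the same field coded two different ways, plus blanks.
students = [
    {"id": 1, "gender": "M"},
    {"id": 2, "gender": "F"},
    {"id": 3, "gender": "1"},
    {"id": 4, "gender": "2"},
    {"id": 5, "gender": ""},
    {"id": 6, "gender": "F"},
    {"id": 7},
]

for value, count in frequency_check(students, "gender").most_common():
    print(f"{value:12} {count}")
```

A mix of letter codes and numeric codes in the output, or a large "(missing)" count, is a signal to go back to the responsible unit with the completeness questions listed earlier.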

Where Do the Data Go? 

Once it has been determined which data elements each particular unit 
gathers, the next step is to determine where data elements go after they are 
collected. For each data element (or group of data elements), ask personnel 
in pertinent units: 

• Which databases do these data elements go into? Are certain data 
elements put into multiple databases? 

• How are these entries in other databases updated? On what schedule 
and who is responsible? Are old values over-written in this process? 

• Are fields used for multiple purposes? Are different offices using 
supposedly “unused” fields for different purposes and including their 
own data elements and codes? 

• What definitions are used for various data elements used in multiple 
databases? 

• Who has authority over these databases? 

Results of this portion of the data audit are often best documented in 
terms of a map or flow chart showing clearly how and when particular 
data elements move from point of collection to the various places where 
they are archived or used. 
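
Before drawing the map or flow chart, some audit teams may find it convenient to record the flows in a simple structured form. The sketch below is one possible way to do that in Python; the element names, offices, databases, and schedules are invented for illustration, and the helper names `build_flow_map` and `stored_in_multiple_places` are assumptions rather than part of the Toolkit.

```python
from collections import defaultdict

# Hypothetical audit notes: (data element, collected by, stored in, update schedule)
flows = [
    ("birth_date", "Admissions", "Student Information System", "at application"),
    ("birth_date", "Housing", "Housing database", "each fall"),
    ("placement_score", "Testing Center", "Testing Center spreadsheet", "after each test date"),
]

def build_flow_map(flows):
    """Group the recorded stops by data element."""
    flow_map = defaultdict(list)
    for element, collector, database, schedule in flows:
        flow_map[element].append((collector, database, schedule))
    return dict(flow_map)

def stored_in_multiple_places(flow_map):
    """Return elements kept in more than one database (candidates for reconciliation)."""
    return sorted(
        element
        for element, stops in flow_map.items()
        if len({database for _, database, _ in stops}) > 1
    )

flow_map = build_flow_map(flows)
for element, stops in flow_map.items():
    for collector, database, schedule in stops:
        print(f"{element}: {collector} -> {database} ({schedule})")
print("Stored in more than one database:", stored_in_multiple_places(flow_map))
```

Each printed line corresponds to one arrow on the eventual flow chart, and the final list points to elements whose update schedules and definitions need to be compared across databases.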

“Walking the Process” 

In order to accomplish these various steps, it is frequently useful to 
physically “walk the process” of collecting data. One way to do this is to 
adopt the student’s (or faculty member’s) perspective and go through each 
step that has to be accomplished in order to complete a particular action — 
to register for a class, or to obtain financial aid, for instance. Determine 
the specific forms that students have to complete for which units across 
the campus in order to attain their objective. Follow up on each data 
element (or group of data elements) using the questions listed above. 

A pilot institution cleverly combined this aspect of the data audit with 
its ongoing institutional commitment to service. Staff members selected 
actual students to go to specific offices to “walk the process,” both to 
collect data and information for the data audit and to report on how 
well they were treated and on the customer service skills of the 
personnel in the various offices visited. 

Another way to “walk the process” is from a data element’s point of 
view. This will allow you to determine which units gather particular data 
elements (and identify any redundancies), which database(s) particular 
data elements are kept in, what definitions and categories are used in 
which databases, who is responsible for each data element, and who is 
using that data element to what end. 



Supply Side Summary: What Is Important to Gather During this 
Process? 



When gathering information about data on campus, it is important to 
collect as much documentation as possible about data and databases that 
exist. The following types of support documentation will be especially 
useful: 

• Copies of forms, both paper and online. 

• Data element dictionaries for databases. 

• Data element definitions (if not included in the data element 
dictionary). 

• Documentation on the structure of databases and the format(s) in 
which individual data elements are kept. 

• Information on historical database files and how many years of data 
are available. 

• Security guidelines and change procedures for all of the databases 
encountered. That is, who has access to the data and who has authority 
for updating or changing the database or its data elements? 



The Demand Side 

The other aspect of doing a data audit is to determine what data needs 
exist on your campus. This aspect is best conceived of as the “demand” 
side of the analysis, complementing the “supply” side represented by the 
inventory of existing data sources. In addition to talking with individuals 
about the data they collect, you will need to talk with institutional decision- 
makers and other data users. Often there is considerable overlap among 
individuals and offices that are data users and data collectors, but do not 
assume that there is. 

Offices, units, and individuals on campus that need to be contacted 
about data needs include academic affairs personnel — the provost, deans, 
and department heads — student affairs personnel, as well as individuals 
involved in accreditation studies or who must report information to state or 
federal officials (directors of TRIO programs or teacher education programs, 
for example). Examples of external reporting that may be required would 
be to accreditors, to state agencies, or to governing boards. When you 
visit these offices and units, ask them: 

• Who are the office’s key internal and external constituencies? 

• What kinds of decisions does the office regularly make, and what 
kinds of information are needed (or desired) to make them? 

• To whom must the office report data and information? 

• What existing reports and data reporting requirements does the 
office have, and is it able to fulfill them? Working backward 
from existing reports and procedures, determine what data are needed 
and how calculations are made. 

• What kinds of reporting and decision cycles are typical? (For example, 
grant budget cycles can run on academic years, July-June fiscal 
years, or even October-September fiscal years, which will affect 
when data are needed.) 

• How current and accurate do data and information need to be? 

• What is missing that office personnel deem essential to have (that is, 
data they need versus data they want)? Are data missing because 
they do not exist, or is existing information not accessible to office 
personnel? 

• What questions should office personnel be able to answer about the 
first year of college? 

• What are their perceptions of gaps in data and information? 

Make sure to point out that even though you are asking them these 
questions, it does not mean the data audit will result in complete resolution 
of issues that are raised or that all their data desires will be met. Instead 
you should explain carefully that the intent is to inventory information 
resources and needs to help decisionmakers at the institution decide how 
to proceed. 

Demand Side Summary: What Is Important to Gather During this 
Process? 

Just as on the “supply side” of the data audit, it is useful to gather as 
much documentation as possible when you visit each site. Documents 
that you should gather from these offices and units include: 

• Copies of recent reports that the unit has submitted using official 
(and unofficial) institutional and office-level data. 

• Copies of data-reporting requirements, including schedules and 
format specifications. 

• Samples of the formats or methods the unit uses to analyze data (e.g., 
calculation routines used to compute class loads, advising schedules). 

Bottom Line: Summary of Procedures for Carrying Out a 
Data Audit 

Procedures for carrying out a data audit are summarized below. Please 
note, however, that while all these steps should be accomplished, it is 
important to be flexible in carrying out this task. Different institutions 
may require somewhat different approaches because of their organiza- 
tional structures and politics. At the same time, some office or individual 
in the past (usually the Office of Institutional Research or its equivalent) 
may have previously accomplished much of the work included in a data 
audit. Where this is the case, it is useful to refer to this previously accom- 
plished work as a starting point. Keep in mind that circumstances may 
have changed, there may be new office personnel, or something may have 
been overlooked in the process. 

1. Identify offices and units across campus that gather or keep data 
pertinent to the first year of college, as well as those offices and units 
that use or report data. Emphasize that the data audit is a collabo- 
rative institutional process. 

2. Contact appropriate individuals who can fairly represent the 
resources and perspectives of these offices and units. 

3. Set up mutually agreeable times to visit these individuals in their 
offices in order to discuss data sources and data uses. 

4. Approximately one week prior to visiting, send these individuals a 
list of the questions to be discussed and the artifacts or documents 
you will want to collect from them. If a particular office is only a data- 
source office or only a data-use unit, adjust the list of questions 
accordingly. 

5. Conduct the site visit. Ask your questions. Clarify, clarify, clarify. 
Take detailed notes. Collect artifacts and documents. Where appro- 
priate, “walk the process” by simulating the steps a student (or 
faculty/staff member) would take, or follow the path of a particular 
data element from point of collection through data entry, archiving, 
and use. 

6. Before leaving, thank the people involved for their time and help. 
Invite them to contact you if they think of anything further that 
might be of use. Secure an agreement that should there be any 
follow-up questions, they would be willing to respond to them. 
Confirm their telephone numbers or email addresses. 

7. Send thank-you notes to people you visited and interviewed; it might 
be appropriate to copy their managers or bosses as well. 

In order to facilitate a culture of data use and information sharing on 
campus, consider making the findings of the first-year data audit available 
to the campus in the form of a brief report. 

After the “raw data” generated by the data audit have been assembled 
(interview notes or tapes or transcripts as well as artifacts documenting 
existing data and reports) from offices representing both data sources and 
data users, results should be synthesized to yield a coherent picture of 
data resources and the culture of data use at your campus. Many different 
ways of summarizing results are possible, depending upon institutional 
needs. In some cases, you may want to prepare a single comprehensive 
report on findings. In other cases, it may be more useful to organize findings 
around common topics — for example, lists of first year data resources and 
who has them, recommendations for a “common core” of data (see 
Appendix A), and a report on the current culture of data use on campus. 
As noted earlier, it is also usually appropriate to prepare a brief summary of 
the project and its results for wider distribution to the campus community. 

Outcomes of the pilot study fit into four categories. Institutions found 
that they learned lessons about a) their first-year programs, b) broad data 
issues, c) how to improve data audit implementation, as well as d) refine- 
ments that could better their institutional infrastructures. Examples of 
lessons learned for first-year programs included: 

• Some institutions found that they did not have program goals for 
their first-year programs. 

• Some were under the impression that there was more tracking 
of first-year students happening on campus than was actually 
occurring. 

• They found that the data audit raised awareness about the entering- 
student program. 

• Student engagement with services, particularly student support, was 
not captured on some campuses. 

• A few campuses discovered that student course evaluations were not 
linked or kept in a database. 

Examples of lessons learned about data and data use on campus included: 

• Some institutions found that they do not enter all data, “resulting in 
loss of potentially valuable data.” 

• The issue was raised of who will decide which data are entered when 
budgets are tight and personnel are already overly busy. In fact, data 
may not be entered; but as processes are increasingly automated, 
institutions should consider entering more data as doing so becomes feasible. 

• Initially, at one institution, staff wanted to eliminate data but by the 
end of the audit many wanted to gather more data. 

• First generation college attendee information was often collected for 
only a particular population of students. The same was true for email 
addresses. 

• At one institution, pilot project administrators discovered that the value 
of data collection was challenged in student services areas because 
of the difficulty in seeing the connection between data collection and 
improved services. 

• Data are often not coordinated, shared, or organized well. 

• One institution is now going to put its fact book on the web. 

• Both university and community college personnel were very cooperative 
in supplying needed data. 

• Problems were uncovered with not storing or archiving data, having 
no historical data, incurring large amounts of data loss, and finding 
that needed data were being overwritten or purged. 

• At some institutions no data element dictionary existed. 

• When a college named a data element one way and another name 
was used for external reporting purposes, both names needed to be 
included in the data element dictionary to alert others to the dual 
name. 

• At one institution, the data audit confirmed what they already knew 
about their data and institutional data processes for the first year of 
college. 

• A few pilot schools encountered some resistance from the gatekeepers 
of the data. 

• One institution found that data were available in the data warehouse, 
but too little training was available on how to extract useful data, 
creating an accessibility issue. 

About conducting a data audit, personnel at pilot institutions found that: 

• Sending out questions and requests for artifacts ahead of time meant 
that units had them available when the interviewers arrived. 

• Use of worksheets aided them in the collection of information for the 
data audit. 

• In some cases, multiple interviews were necessary with different 
people in offices because no single person knew what was possible. 

• Results of the data audit will be used to prioritize future data needs. 

Generally, pilot institutions found that the data audit: 



• Uncovered questions in addition to answers. 

• Identified redundancies that could be eliminated or opportunities to 
be studied during the next round of strategic planning. 

• Will result in an institutional report back on outcomes of the data audit 
to the institutional management team, the President's Council, etc. 

• Helped to create an institutional mindset around a total university 
approach: assessing our effectiveness by finding and using available 
data. 

• Scared some people through use of the word “audit.” 

• Made some departments relieved that they were not being singled 
out for review, since this was part of a larger, institution- or 
unit-wide project. 

• Led to increased understanding among committee members regarding 
what different departments do and how they fit into the overall 
functioning of the institution. 

• Gave people an understanding that these were issues other 
institutions were working on. 

• Was instrumental in highlighting the need for evidence in the form 
of data. 

• Was a way to involve faculty in data use. 

• Identified the need for an institutional Data Definition Committee. 

Results were, in the words of one pilot institution administrator, 
“strikingly consistent. Most people expressed a frustration with the 
difficulties encountered in trying to get data and most people wanted 
access to the same data and were trying to create the same types of 
reports — all independently with absolutely no efficiencies of scale.” 



Whatever the format for reporting ultimately selected, the following 
outputs of the data audit should be fully described: 




• Data element lists and specifications, including whether each element 
is kept in text or numeric format, where the data element comes from, 
when it is entered or how frequently it is updated, how consistent the 
coding is, how different units have interpreted definitions (for 
example, does a “0” mean zero or missing?), etc. 



• File structures and extract schedules including when “snapshots” of 
live transactional databases are taken. 

• Most common uses of data on campus. 

• Common reporting formats/templates. 

• A review of security issues. 

• A review of the need to establish a Data and/or Security Committee. 

• Locus of responsibility for maintenance and control for different 
kinds of data and databases. 

• Recommendations on the various forms of user training needed to 
facilitate use of data resources. 

• Recommendations for methods and approaches for collecting needed 
data that are not currently collected by the institution. 

In preparing to summarize the outputs of a data audit, it is helpful to be 
aware of and review some frequently encountered findings of such an 
exercise at other campuses. Among them are: 



1. The need to reposition student databases to examine behaviors 
from the student rather than from the institutional point of view. It 
is not unusual for institutions to collect a lot of data on students and 
student behavior, but not to use this information to investigate 
questions like, “How did students act out the first year curriculum 
in terms of course-taking?” or “How many first-year students visiting 
academic skills centers did so more than once each term?” One 
reason for this situation is that existing databases focus on the needs 
of record-keepers, not information users. Therefore they are typically 
hard to access, hard to use, and organized cross-sectionally 
rather than longitudinally. 

2. Opportunities to collect data more systematically using processes 
already in place and existing points of contact with students. There is 
always a tendency to invent brand-new data collection efforts every 
time a new information need is identified. Also, administering 
surveys using different methodologies across terms and years can 
alter outcomes and results, which can create false perceptions of 
change. This situation leads to students being repeatedly surveyed. 
The point here is to be more deliberate about taking advantage of 
contacts/opportunities that are already available. Examples of these 
include student orientation sessions at which additional surveys 
might be collected, placement testing, student evaluation of instruc- 
tion, and face-to-face advisement sessions. Emerging technology 
also provides opportunities. For example, more and more libraries, 
bookstores, residence halls, and student service offices are using 
“Smart Card” or “Card Swipe” systems to record use and atten- 
dance, creating an automatically generated record of contact and 
intervention for each student that can be recovered and used more 
broadly. Or, for students who access offices or services online, web 
usage statistics are another form of data to be collected. In addition, 
being deliberate in gathering and using data will likely reduce 
duplication of effort on campus and wasted resources. 

3. Unclear or inconsistent definitions across units for similar data 
elements. This mismatch can occur in both directly extracted and 
locally constructed or calculated data elements. Every institution can 
benefit from having clear definitions for data elements and distrib- 
uting documentation containing those definitions widely to everyone 
on campus. For example, offices may use different definitions of first- 
time students; some may use first-time, full-time undergraduates, 
others may use the entire population of first-time undergraduates, 
which would include both full-time and part-time students. 

4. Self-reinforcing “spirals” of misperception on the part of those 
responsible for collecting/archiving data and those who seek to use 
it. A frequent finding of a data audit, for example, is that user com- 
munities have given up trying to obtain some kinds of data because 
of the difficulty of getting it — resulting in a perception by data 
communities that there is “no demand” for these data by users. 
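
The definitional mismatch described in item 3 is easy to demonstrate with a small example. The Python sketch below applies two plausible definitions of "first-time student" to the same records and gets two different counts. The records, the field names, and the 12-credit-hour threshold for full-time status are assumptions made for illustration; an institution would substitute its own rules.

```python
# Hypothetical enrollment records for one term.
students = [
    {"id": "A1", "first_time": True, "credit_hours": 15},
    {"id": "A2", "first_time": True, "credit_hours": 6},
    {"id": "A3", "first_time": False, "credit_hours": 12},
    {"id": "A4", "first_time": True, "credit_hours": 12},
]

FULL_TIME_MIN_HOURS = 12  # assumed threshold; institutions define this locally

def first_time_all(students):
    """Definition A: every first-time undergraduate, full-time or part-time."""
    return [s for s in students if s["first_time"]]

def first_time_full_time(students):
    """Definition B: first-time undergraduates enrolled full-time only."""
    return [
        s for s in students
        if s["first_time"] and s["credit_hours"] >= FULL_TIME_MIN_HOURS
    ]

print("Definition A count:", len(first_time_all(students)))
print("Definition B count:", len(first_time_full_time(students)))
```

Two offices reporting "the number of first-time students" from the same database would publish different figures here, which is why documented, shared definitions matter.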

As you seek to summarize the results of the data audit on your campus, 
it is important to be sensitive to these common issues, and to be reassured 
that they are not unusual. Furthermore, by being open to suggestions, you 
may discover new ways in which data can benefit all parties involved. 

The final step is to close the feedback loop to create a true culture of data 
use by communicating results of the data audit of the first year of college 
widely and taking action based on results of the analyses. 



A principal outcome of many data audits is an institutional determination 
to create a “common core” of data that can be used by multiple units to 
address similar issues or to conduct consistent investigations. Each data 
element in the common core has a collectively agreed-upon definition. 
Typically, all such data elements are maintained in an accessible database 
environment for use by a variety of offices and functions at the institution. 
Additional analytical files are often derived from the common core, which 
contains data periodically extracted from operational databases. Common 
core definitions, and easy access to data, provide a consistent basis for 
units to conduct internal institutional analyses and external reporting. 

This section describes a recommended list of common core data elements 
for examining the first year of college. Such a common core usually includes 
data about a range of entities including students, courses, course enrollments, 
applicants for admission, program affiliation, participation in particular 
activities, faculty and staff, facilities and equipment, and resources and 
expenditures. In most cases, these elements will be maintained in the 
same formats used in the “parent” databases from which they are drawn. 
In some cases, however, the recommended data element is a “summary” 
element — derived or calculated from one or more existing data elements 
(for example, the total number of terms a student has been enrolled at an 
institution or a student’s credit hour completion rate). In a few cases, recom- 
mended data elements are not collected anywhere on the campus but are 
sufficiently useful or important for the institution to find a way to gather them. 
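
The two “summary” elements mentioned above can be derived from course enrollment records in a straightforward way. The following Python sketch shows one possible calculation, assuming each record carries a term code plus credits attempted and earned; the field names and sample values are hypothetical, and an institution's own rules (for example, how withdrawals or repeated courses are treated) would need to be written into the documented definition.

```python
# Hypothetical course enrollment records for a single student.
enrollments = [
    {"term": "2023FA", "course": "ENG 101", "credits_attempted": 3, "credits_earned": 3},
    {"term": "2023FA", "course": "MTH 110", "credits_attempted": 4, "credits_earned": 0},
    {"term": "2024SP", "course": "HIS 120", "credits_attempted": 3, "credits_earned": 3},
]

def terms_enrolled(enrollments):
    """Total number of distinct terms in which the student has enrollments."""
    return len({e["term"] for e in enrollments})

def credit_completion_rate(enrollments):
    """Credits earned divided by credits attempted; None if nothing was attempted."""
    attempted = sum(e["credits_attempted"] for e in enrollments)
    earned = sum(e["credits_earned"] for e in enrollments)
    return earned / attempted if attempted else None

print("Terms enrolled:", terms_enrolled(enrollments))
print("Credit hour completion rate:", credit_completion_rate(enrollments))
```

Storing the calculation rule alongside the derived element lets every unit reproduce the same figure from the parent database.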

To be included in the common core, a particular data element should meet 
one of two criteria: a) it is required for important management or decision- 
making purposes by multiple units or departments on campus, or b) it is 
useful for wider institutional or unit-level planning or evaluation purposes 
such as program review, budgeting, or consistent external reporting. 
Inclusion of data elements in the common core does not necessarily imply 
that all of them must be held in a single database. All should, however, be 
defined in common and should be easily accessible to potential users. 



The Toolkit’s recommended contents of a common core of data to be 
assembled to examine the first year of college are offered as a place to start 
the discussion on your campus. Depending on capacity and circumstances, 
you will undoubtedly come up with a somewhat different list. After your 
institution’s version of the list is agreed upon, each data element included 
in the common core is typically defined and documented for use by the cam- 
pus community. For each element, documentation is usually provided on: 

• The definition of the element. If the element is derived from other 
operational data elements or involves calculated statistics, appropriate 
calculation rules should be included in the definition. This requires 
determining the form of the data element in existing databases. (We 
will use “age” as an example. For the purposes of the example, we 
want “age” available in numeric format, two characters long.) A 
data element may be: 

  - Present in the form required (in our example, the field would be 
    filled with two-digit numbers). 

  - Present, but in need of minor recoding (in this case, “age” 
    would be in the field as Arabic numerals that have been entered 
    as text with a character in front of them, often an apostrophe). 

  - Present, but not in the proper structure (in this case, “age” might 
    be found as text such as “twenty-one”). 

  - Not present, but can be created from a combination of existing 
    elements (in this case, “age” is derived, or calculated, from a 
    student’s birth date and the current date). 

  - Not currently present or collected in any form (this would be 
    the case if neither “age” nor “birth date” were gathered, 
    leaving no way to determine age). An example of the element 
    being present but not useful would be if students chose from 
    age ranges (<18 years old, 19-24 years old, >25 years old). 

• The source and data collection procedures used to collect the ele- 
ment; such documentation should include the timing of data collec- 
tion, should note the instrument or procedure used to actually obtain 
the data, and the office(s) responsible for collecting, entering, and 
maintaining the data. 

• The principal clients or uses for the element. 
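
The “age” cases discussed under the definition item above translate naturally into a small recoding routine. The Python sketch below is a minimal illustration under stated assumptions: the function names `normalize_age` and `age_from_birth_date` are hypothetical, the leading character to strip is assumed to be an apostrophe, and spelled-out ages such as “twenty-one” are deliberately left unresolved because they would need a lookup table or manual review.

```python
from datetime import date

def age_from_birth_date(birth_date, as_of):
    """Whole years elapsed between birth_date and as_of."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not yet occurred in the as_of year
    return years

def normalize_age(raw_age=None, birth_date=None, as_of=None):
    """Return age as an integer, or None when it cannot be determined."""
    if raw_age is not None:
        # Minor recoding: numerals stored as text, possibly with a leading apostrophe.
        text = str(raw_age).strip().lstrip("'")
        if text.isascii() and text.isdigit():
            return int(text)
        # Text such as "twenty-one" is not in a usable structure; fall through.
    if birth_date is not None:
        # Derived element: calculated from birth date and a reference date.
        return age_from_birth_date(birth_date, as_of or date.today())
    return None  # neither age nor birth date is usable

print(normalize_age("'19"))
print(normalize_age(birth_date=date(2000, 6, 15), as_of=date(2020, 6, 14)))
print(normalize_age("twenty-one"))
```

Whatever rule is chosen for the reference date (today, census date, first day of term) belongs in the element's documented definition so that all users derive the same value.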

Each type of data in the recommended common core is briefly 
described below. Individual data elements recommended for inclusion 
under each subsection are provided in Appendix A. The displays in this 
appendix list the recommended data elements and provide additional 
information about each under the following headings: 

• Element Name. Describes the data element using the name typically 
used to describe it at most institutions. 

• Source. Notes the primary current source/location of the data element 
in existing operational databases. At some institutions it is possible 
to draw the complete set of data required for the common core from 
one source; in other situations, this is impossible. When elements are 
maintained separately in one or more parent systems, definitions, 
formats, and collection procedures should be standardized when 
these elements are included in the common core. Instances where 
some kind of reconciliation is needed should be noted under the 
“Comment” heading — as illustrated by several of the entries in 
Appendix A. 

• Length. Indicates the anticipated character length of the data field 
required. For example, “age” would normally be a two-digit field, 
but it is conceivable that someone over 100 could one day enroll. Similarly, 
data field lengths sized for most English-based names may be too long 
or too short for names of students from other countries. 

• Type. Indicates, so far as is known, whether the element is a character 
or numeric field. (Note: For both Type and Length, the values 
given are approximate. In most cases, however, they correspond to 
the way the element is currently maintained in the parent database 
from which it is extracted.) 

• Comment. Used to provide brief comments on particular data elements 
where needed — for example, to note the source data for a required 
calculation or to highlight the fact that inconsistencies in coding or 
definition exist that need to be resolved. 
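
The five headings above map directly onto a simple record structure for a data element dictionary. The Python sketch below is one hypothetical way to hold such an entry and to check whether a value fits the documented length and type; the class name `DataElement`, the `fits` method, and the sample entry are illustrative assumptions, and the numeric check is simplified to whole numbers only.

```python
from dataclasses import dataclass

@dataclass
class DataElement:
    """One entry in a data element dictionary, using the headings described above."""
    name: str        # Element Name
    source: str      # Source (parent database or office)
    length: int      # Length in characters
    data_type: str   # Type: "character" or "numeric"
    comment: str = ""

    def fits(self, value):
        """Check a value against the documented length and type.

        Simplification: numeric fields are treated as whole numbers only.
        """
        text = str(value)
        if len(text) > self.length:
            return False
        if self.data_type == "numeric":
            return text.isascii() and text.isdigit()
        return True

# Hypothetical entry for the "age" example used in this section.
age = DataElement(
    name="Age",
    source="Student information system",
    length=2,
    data_type="numeric",
    comment="Illustrative entry; may be calculated from birth date if not stored.",
)

for candidate in (19, 101, "nineteen"):
    print(candidate, age.fits(candidate))
```

The value 101 fails the check, which mirrors the caution above about a two-digit age field and a student over 100.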

Student Data Elements 

This category contains standard, commonly used student descriptors of 
several kinds including demographics, educational background, current 
(and past) enrollment status at the institution, as well as academic 
standing and performance. Most of these data elements already reside in 
the student information system, but current data and/or coding structures 
may need to be modified. If the data audit indicates that particular needed 
items are not currently included in any current dataset, a means to collect 
these elements should be considered by your institution. 

Specific comments associated with these data elements include the 
following (see the lists included in Appendix A for a comprehensive list 
of suggested data elements): 

• Name/address. Full names and addresses for students are used principally 
for generating mailing labels; these elements are consequently 
not included in recommended analytical files, but they may be useful 
for other purposes. This group also includes a student identifier (often 
the Social Security number, although many institutions now create unique 
student identification numbers instead of using the SSN), which is 
used as the link across the many databases. 

• Demographics. These elements include gender, race, ethnicity, birth 
date, age, county of origin, place of residence, etc. (See the data element 
listing for a comprehensive list of suggested demographics of interest.) 

• Parents’ education level, occupation, and employment. Institutions do 
not typically collect this information. However, current literature 
suggests a relationship between these factors and retention, so institutions 
might consider including these data elements for analysis. 

• Test scores. Skills testing and placement tests are key factors for success 
in the first year at many institutions. These data need to be drawn 
from whatever parent database they are kept in. Note that sometimes 
these data are included in the regular student information system; 
sometimes they are kept in a variety of other places including the 
placement office, assessment office, or even individual departments 
(usually English and math). 

• Enrollment status elements. These consist of data about items like 
full-time/part-time status, student’s entering major, and other elements 
that describe how the student is classified. They duplicate much of 
what is typically in the student information or registrar’s system. 

• Goal/intent elements. These indicate such things as the reason a student 
is seeking a degree and whether (s)he intends to complete a degree 
at the institution. Increasingly institutions are asking this question of 
their students, often when a student registers for a new term. These 
data elements have proven valuable at many institutions and serious 
consideration should be given to systematically collecting them if 
your campus does not currently do so. 

• Financial aid elements. The level of detail collected about each 
student in the financial aid system may or may not be appropriate 
and should be thoroughly discussed, especially in the light of privacy 
guidelines. Full information about each aid source and amount is 
usually not needed. But maintaining only an aid flag in the common 
core may not provide the required level of detail for tracking such 
things as the effectiveness of financial aid packaging or the impact 
of growing indebtedness. One useful alternative is to create three 
elements: 1) financial aid fund source (e.g., state/federal/private gift), 
2) financial aid fund type (e.g., grant/loan/work study), and 3) financial 
aid level-of-need (categories here are often assigned at the discretion 
of the Financial Aid office, based on characteristics of an institution’s 
student body). Security considerations are also important when dealing 
with financial aid information, and you should be sensitive to the legal 
responsibilities of financial aid offices to protect the privacy of these 
records. It is usually possible, however, with tact and persistence, to 
work out an arrangement where some summary data elements about 
each first-year student can be extracted or constructed. 



Course Data Elements 



These elements are also, for the most part, drawn from regular 
student information systems. Course-level data describe the course in 
catalogue inventory terms such as days of the week, time, instructor, etc. 
and consequently will change little. Section-level data, however, will 
likely change each term. Section data imply, and are actively linked to, a 
corresponding course. Note that recommended section-level data include 
summary information about enrollments and other topics. When conducting 
a data audit on the first college year, you may want to keep only 
information on those courses taken by first-year or lower-division students. 

Specific comments associated with these data elements include the 
following: 

• Prerequisite/co-requisite courses. Although these data are frequently 
carried by regular student information systems, they are just as often 
maintained elsewhere. Departments, schools, and colleges often create 
their own records for prerequisite and co-requisite checking required 
for both internal management and external reporting purposes. 

• Key links. In order to use data elements from different databases, 
each database should include a field or set of fields that uniquely 
identifies each record stored in the database. This information is called 
the key link. For courses, Course Department, Course Number, and 
Course Section Identifier are usually maintained as key links among 
databases. These should be examined carefully to determine if they 
are indeed the appropriate key links to use, and if they are defined 
and maintained consistently across departments and units. 

• Percent instructor assignments. These data elements address how 
much of a given instructor’s time or load is demanded by the 
course/section in question. Often this information is maintained in 
registration databases, but equally often it is maintained separately by 
department, or kept manually in a Dean’s or Academic Affairs office. 
This data element is useful in answering questions such as, “How 
many first-year students have full-time professors as instructors?” 
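
The key-link idea described above can be illustrated with a short sketch. It assumes two hypothetical record sets (one from the registrar, one kept by a department) whose field names (`dept`, `num`, `sec`) and formats are invented for illustration; real systems will differ. It normalizes the three course key fields, checks that the key is unique, and reports records that fail to link.

```python
from collections import namedtuple

# Hypothetical key link for a course section, built from the three fields
# named in the text. Field names and formats are assumptions.
SectionKey = namedtuple("SectionKey", ["department", "number", "section"])


def make_key(row):
    """Normalize one record's key fields so formatting differences
    between units (case, padding, stray spaces) do not block a match."""
    return SectionKey(
        str(row["dept"]).strip().upper(),
        str(row["num"]).strip(),
        str(row["sec"]).strip().zfill(2),
    )


def find_duplicate_keys(rows):
    """Return keys that appear more than once; a key link must be unique."""
    seen, duplicates = set(), set()
    for row in rows:
        key = make_key(row)
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


def link_sections(registrar_rows, department_rows):
    """Pair registrar and departmental records that share a key link.

    Returns (linked, unmatched): linked holds
    (key, registrar_row, department_row) tuples; unmatched lists the
    registrar keys that have no departmental counterpart.
    """
    by_key = {make_key(row): row for row in department_rows}
    linked, unmatched = [], []
    for row in registrar_rows:
        key = make_key(row)
        if key in by_key:
            linked.append((key, row, by_key[key]))
        else:
            unmatched.append(key)
    return linked, unmatched


# Illustrative data only.
registrar = [
    {"dept": "engl", "num": 101, "sec": "1", "enrolled": 24},
    {"dept": "MATH", "num": "110", "sec": "02", "enrolled": 30},
]
department = [
    {"dept": "ENGL ", "num": "101", "sec": "01", "instructor_load_pct": 25},
]
linked, unmatched = link_sections(registrar, department)
```

A long list of unmatched keys is often a sign that the key link is defined or maintained differently across units, which is the condition the audit is meant to uncover.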






Course Enrollment Data Elements 

These data elements document individual interactions between a particular 
student and a specific course section, such as grades and credits earned 
or whether the student dropped the course. Consequently, they are often 
termed “course/person” data elements. The intent of these student and 
course data combinations is to obtain a more complete picture of each 
enrollment. Such data elements are particularly useful in constructing 
analyses of course-taking patterns or of the potential for peer contact and 
interaction in the classroom by systematically presenting data on the 
number of lecture courses and laboratory or team environment courses 
students take. Like section data, these data elements include a range of 
summary items useful in examining course performance. Many large uni- 
versities conduct these types of analyses in order to “manage” their general 
education offerings. These analyses are also quite useful for medium-sized 
and smaller institutions. 
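
As a rough illustration of how course/person data can be summarized, the sketch below counts lecture and laboratory or team-based courses for each student and totals credits attempted and earned. The record layout, the `format` values, and the set of passing grades are all assumptions made for this example, not a prescribed structure.

```python
from collections import defaultdict

# Hypothetical course/person records: one row per student per course section.
enrollments = [
    {"student_id": "S1", "course": "ENGL101", "format": "lecture",
     "credits": 3, "grade": "B", "dropped": False},
    {"student_id": "S1", "course": "CHEM110L", "format": "lab",
     "credits": 1, "grade": "A", "dropped": False},
    {"student_id": "S1", "course": "MATH105", "format": "lecture",
     "credits": 3, "grade": None, "dropped": True},
    {"student_id": "S2", "course": "ENGL101", "format": "lecture",
     "credits": 3, "grade": "F", "dropped": False},
]

PASSING_GRADES = {"A", "B", "C", "D"}  # assumption; local grading rules vary


def course_taking_profile(rows):
    """Summarize each student's course-taking pattern from course/person rows."""
    profiles = defaultdict(lambda: {
        "lecture": 0, "lab_or_team": 0,
        "credits_attempted": 0, "credits_earned": 0,
    })
    for row in rows:
        profile = profiles[row["student_id"]]
        kind = "lecture" if row["format"] == "lecture" else "lab_or_team"
        profile[kind] += 1
        profile["credits_attempted"] += row["credits"]
        # Credits count as earned only if the course was kept and passed.
        if not row["dropped"] and row["grade"] in PASSING_GRADES:
            profile["credits_earned"] += row["credits"]
    return dict(profiles)


profiles = course_taking_profile(enrollments)
```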



Admissions Data Elements 

Admissions data elements document individual applicants to the institu- 
tion and usually mirror the data already kept in regular admissions and 
student information systems. The same comments apply to these data 
elements as to the student data elements already described. Also, the level 
of detail for admissions monitoring should be carefully examined because 
different kinds of students — such as athletes, artists, or musicians — often 
go through different admissions processes. Many students, therefore, 
may have different levels of detail in their records about such things as 
high school coursetaking and performance. Because these data are often 
particularly useful for analyzing the first year of college, serious consid- 
eration should be given to obtaining and loading such data for all first- 
year students on a one-time basis. 



Co-Curricular or Extracurricular Data Elements 

Because the first year of college focuses on co-curricular and extra- 
curricular activities and programs, particularly orientation programs, it 
may be useful to keep information about participation in a database. Of 
interest here are residence life, involvement with student organizations, 
leadership programs, athletics and intramurals, involvement in volunteer 
work, or participation in service learning. 



Personnel Data Elements 

These describe individuals employed by an institution — both instruc- 
tional and non-instructional — and the nature of their relationship with the 
institution. These data elements will give a fuller picture of who teaches, 
advises, and otherwise works with first-year students. Elements noted for 
“all employees” apply to both instructional and non-instructional staff, 
while those described under “instructional staff” apply additionally only 
to faculty and instructors. It is recommended that all 
employees — including part-timers — eventually be included in this structure. 
Note, though, that data about part-time or grant-supported personnel are 
frequently maintained in separate personnel systems. 

Important issues here are security and compatibility with other data 
files. The data elements listed come primarily from established institution-
level personnel systems. Many elements also included in the student 
information system and other personnel databases (such as databases on 
adjuncts) may not correspond to these definitions. If the audit determines 
that this is the case, a common data element structure should be developed. 
Many of the recommended personnel elements also raise issues of privacy 
and security. As a result, consideration should be given to providing 
controlled access to these fields on an element-by-element basis. 

Specific comments associated with these data elements include the 
following: 



• Key link. Social Security Number (SSN) is proposed as the primary 
key link for both employees and students, but any common identifier 
can be used as long as it is consistently applied. In some cases, insti- 
tutional policies restrict the use of SSN in this manner, and you should 
check specifically with the personnel and registrar’s offices to see if 
this is the case. Some institutions have a locally generated “instructor 
code” that identifies a single person consistently throughout institu- 
tional databases. 

• Demographics. These elements provide typical descriptors of faculty 
and staff. They are especially useful in a first-year context to help 
determine the kinds of individuals that first-year students are exposed 
to — for example, the degree to which students of color are likely to 
find peers or whether regular faculty are teaching first-year classes. 

• Experience. Data should be maintained that would allow access to a 
full faculty vita; many institutions find such data extremely useful 
not only in themselves but to profile the kinds of faculty and staff 
who interact most frequently with first-year students. To do so, keep a 
word processing document with vitas that can be linked via a keyword 
(for instance, personnel number) to a particular individual. 

• Appointments. At some institutions, there is a distinction between 
the department of appointment and the department of assignment. 
Efforts need to be made to capture these distinctions. In addition, 
care must be taken to capture appointment information from all 
sources (e.g., regular appointments, grant-funded appointments). 



Finance Data Elements 






These elements provide data on the status of individual accounts and 
are drawn from finance systems. As with the personnel data elements 
described previously, an important issue here is the compatibility of data 
structures with other systems. Another important question is what level of 
detail should be maintained for analytical purposes. In analyzing the first 
year of college, for example, relevant finance questions typically have to 
do with the personnel costs associated with delivering first-year classes 
and programs, so initial attention should be devoted to data elements that 
bear on these questions. 



Physical Facilities Data Elements 



These provide data on the condition and characteristics of individual 
instructional spaces. They will be especially useful in supporting studies 
of the physical environments typically encountered by first-year students. 
For example, does instruction of first-year students primarily occur in large 
lecture halls? Or do first-year students often face daunting distances 
between classes held back-to-back? 



Summary 






In summary, to be included in the common core, a particular data 
element should meet one of two criteria. It should be required either for 
important management or decision-making purposes by multiple units or 
departments on campus, or for wider institutional or unit-level planning or 
evaluation purposes. Documentation should be provided on a) data element 
definitions, b) source and data collection procedures, and c) clients or uses 
for each element. Then, list for each data element the element name, its 
source, length, type, and any comments about the data element. Eight 
types of data elements are common: student, course, course enrollment, 
admissions, co-curricular or extracurricular, personnel, finance, and physical 
facilities. 



BUILDING A LONGITUDINAL 
TRACKING CAPABILITY 



As noted in the Toolkit’s first section, determining “what happened” and 
“what mattered” in the first year of college requires a fundamental shift 
from a cross-sectional to a longitudinal perspective. Most data about 
students, whether maintained in the regular student records system or 
collected through questionnaires or interviews, are organized in terms of 
the point in time they were gathered. 
Thus, student record systems are structured by term (quarter or semester), 
with student enrollment records distinguished from one another in this 
manner. Similarly, questionnaire data are typically maintained in separate 
files — one for each survey administered. 






What we really want for analysis, though, are data organized by 
student, analogous to a transcript, that assemble what happens to each 
student over time. Data of this kind enable us to really get inside 
the first-year student experience to examine such things as patterns of 
retention and interrupted enrollment, the order in which courses are taken 
and completed (or dropped), and any association between academic 
success and participating in particular kinds of programs, activities, or 
interventions. For example, we may be especially interested in questions 
like the effectiveness of current basic skills placement policies at the 
institution, or the relative effects on student retention of participation in a 
first-year-seminar-type course. Doing this requires us to draw data not 
only from multiple sources but also at multiple points in time in a given 
student’s career. How to approach this task is briefly addressed in this 
section of the Toolkit. 



The Conceptual Basis of Student Tracking 

The conceptual requirements for tracking students over time are 
straightforward but may be difficult to fulfill in practice. Minimally, 
however, two capabilities are required: a) creation of a comprehensive 
longitudinal picture of student progress that reflects the manner in which 
students of different kinds move into, through, and out of the institution; 
and b) identification of a number of distinct behavioral groups of students 
(for example, part-time, single-parent females whose goal is entry-level 
employment) described in terms of cross-cutting characteristics (in our 
example, part-time, marital status, gender, number of dependents, goal). 

Satisfying the first requirement demands a conceptual scheme that repre- 
sents student progress through the institution as a set of linked events and 
decisions. Figure 5 presents an overview of such a model for students who 
progress through their first year of college. The model contains distinct 
components for both admissions and student behavior once enrolled, but the 
two are linked in order to represent respective or simultaneous impacts in 
each of these phases. The logic of the model is to represent student 
progress as a series of discrete decision points and experiences through 
which each student must pass. Furthermore, decision points are of two 
distinct types — those under the control of the student and those determined 
by institutional actions or policies. Matriculation rate, voluntary withdrawal, 
and participation in various types of first-year programming are examples 
of the former, while acceptance rate, mandatory placement, and academic 
good standing are examples of the latter. Together, these experiences and 
decision points constitute a complete chain of events that operate in 
concert, and that determine the status of a particular group of students at 
a particular point in time. 



Such models are of limited value, however, if they do not take into 
account the vastly different kinds of students typically found in an 
entering first-year population. Different kinds of students may behave in 
systematically different ways. Therefore, it may be necessary to examine 
longitudinal progression separately for different types of students. But what 
kinds of differences are important and how should such subpopulations 
be defined? 



Institutional researchers traditionally break down student populations 
in two ways — demographically and by program area. Such breakdowns 
are generally done one at a time. Statistics for items like first-to-second- 
term retention, for example, are commonly calculated and reported 
separately for males and females, for older and younger students, or by 
department or major. While this approach will certainly provide some 
insight, real behavioral groups of students more often consist of combi- 
nations of such factors. An African-American male who is 18 to 21 years 
old and seeking entry-level occupational skills, for example, may have a 
far different set of expectations and experiences than a female liberal arts 
student attending part-time during the day to fulfill general education 
requirements. Appropriate analytical groups are, therefore, best identified 
by disaggregating total enrollment by a number of crosscutting variables 
in combination. What those crosscutting variables might be depends on 
an institution and its particular student body. In some cases, race and/or 
ethnicity by program area will be important. In other institutions, residence 
and gender will be of importance. (For references see Tinto (1987), 
Beal and Noel (1980), Pascarella and Terenzini (1991).) The choice of 
which variables to use will depend on both the nature of the institution 
and the characteristics of the first-year population under study. Figure 6, 
for example, shows such a multiple disaggregation for a small rural com- 
munity college. The disaggregation shown in Figure 6 was accomplished 
by combining data for all students and then sorting by different groupings 
until five distinct groups of students that captured most of the entire stu- 
dent body were determined. 



The right-hand side of this breakdown represents a set of logical 
possibilities for cross-cuts among a set of five demographic and enrollment 
variables (location, program, time, status, gender); rarely will all such 
logical possibilities contain substantial numbers of first-year students. 
Rather, students tend to cluster in certain categories, and these can then 
be reaggregated for analytical purposes. In the example shown, 96.2% of 
the population is accounted for by the five distinct behavioral groups of 
students listed at the bottom of the figure. Each of these groups, once 
identified, was studied separately. This was important in this case because 
it turned out that the factors responsible for persistence and academic suc- 
cess for each group were different. A “generic” student success program 
would, therefore, have made little sense and would likely have had little 
impact. 
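
A minimal sketch of this kind of multiple disaggregation follows. It assumes each student record carries the five variables named above (the values used are invented) and simply tallies every combination, then reports how much of the population the largest cells cover. It is a simplification: in practice, analysts typically reaggregate neighboring cells by judgment rather than taking the largest ones mechanically.

```python
from collections import Counter

# The five crosscutting variables named in the text; the values are hypothetical.
CROSSCUT_VARS = ("location", "program", "time", "status", "gender")


def behavioral_groups(students, variables=CROSSCUT_VARS, top_n=5):
    """Tally students in every combination of the crosscutting variables.

    Returns the top_n most populated combinations and the share of the
    whole population that those combinations account for.
    """
    cells = Counter(tuple(s[v] for v in variables) for s in students)
    total = sum(cells.values())
    largest = cells.most_common(top_n)
    coverage = sum(count for _, count in largest) / total if total else 0.0
    return largest, coverage


def make_students(n, location, program, time, status, gender):
    """Build n identical illustrative records (sample data only)."""
    return [
        {"location": location, "program": program, "time": time,
         "status": status, "gender": gender}
        for _ in range(n)
    ]


students = (
    make_students(6, "in-district", "transfer", "day", "full-time", "F")
    + make_students(3, "in-district", "occupational", "evening", "part-time", "M")
    + make_students(1, "out-of-district", "transfer", "day", "part-time", "F")
)
largest, coverage = behavioral_groups(students, top_n=2)
```

With the sample data, the two largest cells cover nine of the ten students, which mirrors on a small scale the way a handful of groups accounted for most of the population in the example above.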






The Data Requirements of Student Tracking 



Most institutions conduct longitudinal studies of entering students by 
creating discrete files for entering “cohorts” of students. A “cohort” is a 
group of students who entered the institution at the same point in time — 
for example, Fall 2001 or Spring 2002. Cohort-based files contain a 
student-by-student enrollment history for members of the cohort over a 
designated number of consecutive terms, drawn from the “common core” of 
recommended data elements described earlier. The data in such files 
enable us to answer the question, “What is the enrollment pattern of each 
individual in the cohort?” Constructing a file to answer this particular 
question depends upon the availability of both student-record and 
questionnaire-based information, as described in the previous section, on 
a term-by-term basis for the first year of college and beyond. 



Most longitudinal data files of this kind share a number of characteristics. 
Every entering student is assigned to a cohort, based upon his or her first 
term of academic history, and the student remains a member of that 
cohort thereafter. Separate files are typically maintained for each cohort, 
and all analysis and reporting is typically accomplished on a cohort basis. 
Cohorts may be identified in a number of ways, but whatever definition 
your institution adopts must be applied consistently. One way to define 
cohorts is to group students by their first term of active (at least one 
credit hour) enrollment at the institution. 
Complete cohorts of students entering in a particular term, rather than 
samples, are generally used in order to provide credible program-level 
statistics. 
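
The cohort definition given above (first term of active enrollment with at least one credit hour) can be sketched as follows. The numeric term codes, such as 200130 for Fall 2001 and 200210 for Spring 2002, are a hypothetical convention chosen only because they sort chronologically; the field names are likewise assumptions.

```python
def assign_cohorts(term_records, min_credit_hours=1):
    """Assign each student to the cohort of his or her first active term.

    term_records: iterable of dicts with student_id, term (a code that
    sorts chronologically), and credit_hours.
    Returns a dict mapping student_id to cohort term.
    """
    cohorts = {}
    for record in sorted(term_records, key=lambda r: r["term"]):
        active = record["credit_hours"] >= min_credit_hours
        # Only the earliest active term sets the cohort; later terms never change it.
        if active and record["student_id"] not in cohorts:
            cohorts[record["student_id"]] = record["term"]
    return cohorts


# Illustrative records only.
term_records = [
    {"student_id": "A", "term": 200210, "credit_hours": 3},
    {"student_id": "A", "term": 200130, "credit_hours": 0},
    {"student_id": "B", "term": 200130, "credit_hours": 12},
    {"student_id": "B", "term": 200210, "credit_hours": 15},
]
cohorts = assign_cohorts(term_records)
```

Here student A registered in Fall 2001 but carried no credit hours, so under this definition A belongs to the Spring 2002 cohort, while B belongs to Fall 2001.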

The structure of cohort data files involves assembling data elements of 
several different kinds (see Figure 7), drawn from the recommended 
common core (see Appendix A). A first set of data elements, drawn 
largely from registration and admissions records, is compiled once (at 
time of entry) and comprises the first portion of each longitudinal 
student enrollment record. Types of data elements generally included in 
this “fixed” portion of the record are data on demographics, on educa- 
tional background, on basic skills and need for remediation, and on initial 
enrollment status. Additional data elements are then added to this basic 
record at multiple points for each subsequent term that the student is 
enrolled. One set of elements is drawn from term enrollment files at the 
time of official census date, and reflects student enrollment behavior up 
to that point. Types of data elements included are usually program and hours 
attempted in coursework, remediation status, and remediation perform- 
ance. Another set of data elements is captured at the end of the term and 
includes such things as course completion, academic performance, and basic 
skills levels attained. A third set of data reflects the various experiences 
that a student may have engaged in during the term — for example, partici- 
pation in tutoring or counseling sessions, study-group membership, or 
first-year-experience programs. Such data, as noted earlier, are typically 
derived from surveys or from the records of individual student-service 
and academic units. 

The longitudinal file layout shown in Figure 7 is documented as though 
it were composed of “fixed-length” records — one for each student in each 
cohort. This means that all of the information for a given student is main- 
tained in a single record, with portions of the record corresponding to 
potential terms of enrollment. If a student is not enrolled for a given term, 
the portion of the record corresponding to that term is left blank. The 
assumption of a fixed-format record structure is usually made for ease 
of communication and to facilitate the use of commercial statistical 
packages in generating reports and manipulating data. But this is not the 
only way such data files need be constructed. Many analysts maintain 
term data in separate files, for example, and link them together only 
when they are needed to conduct a particular kind of study. The actual 
method used will depend on local computing arrangements, software, and 
the preferences/experiences of those conducting the analyses. But the 
conceptual requirements of cohort-based organization and a data file 
consisting of a set of sequenced term-based “snapshots” of student 
behavior remain unchanged. 
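
The two arrangements just described (separate term files versus a single record per student) can be related with a small sketch. It assumes a hypothetical "fixed" block of entry data per student and one snapshot file per term; the field names are invented and do not reproduce the layout in Figure 7. Term files are linked into one record per student, and the portion for any term without enrollment is left blank.

```python
def build_longitudinal_records(fixed, term_files, terms):
    """Link separate term files into one record per student.

    fixed:      dict of student_id -> entry-time data (the "fixed" portion)
    term_files: dict of term -> {student_id: term snapshot}
    terms:      ordered list of the terms to include
    """
    records = {}
    for student_id, entry in fixed.items():
        record = {"student_id": student_id, **entry}
        for position, term in enumerate(terms, start=1):
            snapshot = term_files.get(term, {}).get(student_id)
            prefix = f"term{position}_"
            record[prefix + "enrolled"] = snapshot is not None
            # A term with no enrollment is left blank (None) in the record.
            for field in ("hours_attempted", "hours_completed"):
                record[prefix + field] = snapshot[field] if snapshot else None
        records[student_id] = record
    return records


# Illustrative data only.
fixed = {
    "A": {"cohort": 200130, "gender": "F", "entry_status": "full-time"},
    "B": {"cohort": 200130, "gender": "M", "entry_status": "part-time"},
}
term_files = {
    200130: {
        "A": {"hours_attempted": 15, "hours_completed": 12},
        "B": {"hours_attempted": 6, "hours_completed": 6},
    },
    200210: {
        "A": {"hours_attempted": 15, "hours_completed": 15},
    },
}
records = build_longitudinal_records(fixed, term_files, terms=[200130, 200210])
```

Either storage choice supports the same analyses; the sketch only shows that the student-by-student view can be assembled on demand from term-based snapshots.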

One question that commonly arises is which types of students should 
be included in analyses. Because many non-traditional students are 
single-term enrollees, some institutions elect to include only those students 
who are seeking degrees, or only those who express an intention to persist 
for more than one term when they look at experiences during the first 
year of college. Others include all students, with the provision that non-
traditional students can be separated out for analysis at a later point. 



Summary 



In order to build a longitudinal tracking capability, data organized by 
student are needed. This structure allows researchers to examine the first 
year of college including patterns of retention and interrupted enrollment, 
the order in which courses are taken and completed (or dropped), and any 
association between academic success and participating in particular 
kinds of programs, activities, or interventions. The conceptual require- 
ments for tracking students over time are straightforward but may be 
difficult to fulfill in practice. Two capabilities are required. The first is 
creation of a comprehensive longitudinal picture of student progress 
that reflects the manner in which students of different kinds move into, 
through, and out of the institution. The second is identification of a number 
of distinct behavioral groups of students described in terms of cross-cutting 
characteristics. 






ANALYZING DATA 



One reason for assembling comprehensive databases about the first 
year of college is that we do not always know exactly how the data will 
be analyzed. As a result, we need a flexible store of data, ready to be 
tapped rapidly in response to a variety of questions as they come up, and 
capable of quickly disaggregating, or segmenting, results for different 
student populations for comparison. In such situations, the specific analy- 
ses to be undertaken cannot be fully predicted in advance. However, par- 
ticular kinds of analyses and indicators related to the first year of college 
can be foreseen. It is frequently a good idea to develop a capacity that 
allows such reports to be generated automatically for both the entire first- 
year population and for designated subsets of that population. This sec- 
tion (and its related appendix) briefly addresses both kinds of reporting 
and provides some reporting templates that illustrate the latter capability. 

A first question is what the focus of such analyses should be. Valuable 
studies of what happens to students in the first year of college fall 
into the following types: 

• Overall Student Flow. The object of such analyses is to determine 
overall patterns of enrollment, persistence, stopout (when a student 
temporarily withdraws from an institution (Tinto, 1987)), and 
reenrollment for particular types of students. Classic statistics like “fall- 
to-fall retention rates” (which calculate the proportion of a given 
entering cohort of students that returns to the institution for a second 
year) and “degree-completion rates” (which calculate the proportion 
of an entering cohort that completes a degree or credential within 
a designated period of time) are commonly reported results of such 
analyses (see “Term-to-Term Progression Report” in Appendix B). 
More specialized analyses, within the first year of college, based on 
the same principles include term-to-term persistence rates, within- 
term course withdrawal rates, or “stopout” studies intended to look 
at whether students withdraw for a period of time and then reenroll. 
Such analyses are usually most useful when they are conducted in 
parallel for different types of students — for example, students drawn 
from different demographic or enrollment status groups. 

• Overall Academic Performance. Analyses of this kind are similar in 
design to overall student flow studies, but concentrate on how well 
different kinds of students perform in their coursework (see 
“Summary Progress Reports” in Appendix B). While grade-point 
average is frequently used as a dependent variable in such analyses, 
other kinds of performance variables are equally appropriate 
including such measures as the proportion of attempted credits that 
are successfully completed, the proportion passing all courses (or a 
particular key course or set of courses) with a “C” grade or better, or 
the proportion remaining in academic good standing. A particular 
topic of interest here for the first year is student success in remedial or 
developmental courses and/or performance in collegiate skills courses 
like English Composition or a variety of mathematics courses. 
Again, such studies are most valuable when the overall performances 
of different groups are compared with one another. 

• Patterns of Experience. Somewhat more complicated to accomplish, 
but often very revealing, are analyses designed to investigate what 
happened to particular types of first-year students in detail. One 
prominent example is course-taking studies (see “Coursework Status 
Reports” in Appendix B) that look at such things as the order in 
which particular courses are taken (and, more particularly, whether 
designed prerequisite sequences are followed), the length of time 
elapsed between taking a particular skills-development course and 
when the skill in question is first applied (math skills, for example, 
can atrophy rapidly if they are not applied promptly in subsequent 
coursework), or the extent to which students are taking coursework 
across a wide variety of fields rather than taking a related body of 
courses simultaneously (that is, breadth vs. depth). Analyses of this 
kind are again particularly applicable to basic skills or remedial course 
sequences, which are usually designed to be taken in a particular 
order. 

Another important factor of experience to be investigated is student 
credit loads, which may vary considerably both during and across 
terms. Sometimes students “shop” for courses during an add-drop 
period in order to identify those they find most appealing (or think 
they can pass easily). Other students may “over-enroll” by attempting 
more courses than they might be able to complete (ironically, this is 
sometimes a “catch-up” strategy practiced by students who have 
failed to complete one or more courses in their first term of enroll- 
ment and that often puts them further at risk). A final dimension of 
experience concerns the out-of-class or co-curricular experiences 
that students may have engaged in. If data are available about such 
things as participation in tutoring, formally organized study groups 
or learning communities, or whether students visit academic skills 
and counseling centers, they can be used to create portraits of both 
overall participation in such experiences and their effectiveness. 

• Early-Warning. Slightly more sophisticated are analyses that try 
to put all of these data together to create indicators of potential 
academic difficulty. For example, analyses of past cohorts of entering 
students may reveal patterns of association between particular clusters 
of incoming student characteristics and later academic difficulty, 
interrupted enrollment, or particular sets of course-taking behaviors. 
These characteristics can cluster around social risk, academic risk, 
etc. If these prove statistically robust, they can be used to help create 
profiles of “at-risk” students whose progress might be more carefully 
monitored from the outset. When engaging in such studies, though, 
it is always important to remember that they are based on statistical 
tendencies, not preordained “fact.” It is therefore critical to use such 
indicators judiciously and appropriately. 

• Program Effectiveness. Another way to put all of these data together is 
to try to answer questions about the relative effectiveness of particular 
aspects of the first year of college in promoting persistence or aca- 
demic performance. Examples might include the effectiveness of 
student participation in voluntary orientation programs or study 
groups on such outcomes as fall-to-fall retention, course completion 
rates, or overall grade performance. More narrowly-defined examples 
include the relationship between enrollment and performance in 
collegiate-skills-building classes like composition and math, and 
related later coursework that requires such skills (see “Coursework 
Placement/Effectiveness Reports” in Appendix B). Longitudinal 
data files are critical for accomplishing such studies because experi- 
ences occurring at one point in a given student’s enrollment history 
need to be associated with measures taken at a later point. Again, when 
conducting such analyses, it is important to remember that what works 
for one kind of student may not work for others — so disaggregation 
is important. It is also important to try to disentangle the many factors 
that may be at work. For example, the apparent “effectiveness” of 
a particular program element may simply be a result of the fact 
that certain kinds of students participate, not because the program is 
inherently beneficial. 
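
The early-warning analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`hs_gpa_band`, `first_gen`, `difficulty`), the records, the 0.5 threshold, and the minimum group size are all hypothetical and would need to be replaced with an institution's own data elements and locally validated cutoffs.

```python
from collections import defaultdict


def _make(band, first_gen, n, n_difficulty):
    """Build n made-up past-cohort records, n_difficulty of them with later difficulty."""
    return [
        {"hs_gpa_band": band, "first_gen": first_gen, "difficulty": i < n_difficulty}
        for i in range(n)
    ]


# Hypothetical past cohort: entering characteristics plus a later-difficulty flag.
PAST_COHORT = (
    _make("low", True, 4, 3)
    + _make("low", False, 3, 1)
    + _make("high", True, 3, 1)
    + _make("high", False, 4, 0)
)


def difficulty_rates(records, keys):
    """Return {cluster: (n_students, share_with_difficulty)} for each cluster of characteristics."""
    counts = defaultdict(lambda: [0, 0])  # cluster -> [n, n_with_difficulty]
    for rec in records:
        cluster = tuple(rec[k] for k in keys)
        counts[cluster][0] += 1
        counts[cluster][1] += int(rec["difficulty"])
    return {c: (n, d / n) for c, (n, d) in counts.items()}


def flag_clusters(rates, threshold=0.5, min_n=3):
    """List clusters with at least min_n students and a difficulty rate above threshold."""
    return sorted(c for c, (n, rate) in rates.items() if n >= min_n and rate > threshold)


rates = difficulty_rates(PAST_COHORT, ["hs_gpa_band", "first_gen"])
flagged = flag_clusters(rates)
print(flagged)
```

A flagged cluster reflects a statistical tendency in past cohorts only; it identifies groups whose progress may merit closer monitoring, not a prediction about any individual student.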

Many variations on these basic types of analysis are possible, and they can 
be combined in multiple ways to yield valuable tools for understanding 
the impact of the first year of college. Note that undertaking customized 
“as-needed” studies like these is not the only way a “common core” of 
data assembled in a longitudinal database structure can be used. Indeed, 
for ease of use and access, many institutions choose to preprogram a set 
of standard report templates that can be automatically generated using 
commercially available software packages (like SAS, SPSS, Excel, and 
Access). Typically, such reports are designed to summarize the status and 
behavior patterns of a particular cohort of first-year students. They are 
usually set up in a “matrix” or tabular form in which the columns of the 
report represent performance variables (like the percent of an entering cohort 
retained in the second term or the proportion completing key courses with 
a grade of “C” or better), and the rows of the report represent specific 
characteristics of the student body (like gender, entering academic skill 
level, or whether the student participated in various first-year experiences). 
Examples of some of the most commonly used reports of this kind are 
provided in Appendix B, together with documentation that indicates how 
each of their entries should be constructed. 
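
As a rough illustration of the "matrix" layout described above, the following Python sketch builds a report whose rows are student characteristics and whose columns are performance variables. The field names (`gender`, `orientation`, `retained_term2`, `key_course_c_or_better`) and the eight records are hypothetical; an institution would substitute elements from its own common core.

```python
# Hypothetical cohort records; each flag is True/False for one student.
COHORT = [
    {"gender": "F", "orientation": True, "retained_term2": True, "key_course_c_or_better": True},
    {"gender": "F", "orientation": True, "retained_term2": True, "key_course_c_or_better": True},
    {"gender": "F", "orientation": False, "retained_term2": True, "key_course_c_or_better": False},
    {"gender": "F", "orientation": False, "retained_term2": False, "key_course_c_or_better": False},
    {"gender": "M", "orientation": True, "retained_term2": True, "key_course_c_or_better": True},
    {"gender": "M", "orientation": True, "retained_term2": True, "key_course_c_or_better": False},
    {"gender": "M", "orientation": False, "retained_term2": False, "key_course_c_or_better": False},
    {"gender": "M", "orientation": False, "retained_term2": True, "key_course_c_or_better": True},
]


def matrix_report(students, row_vars, col_vars):
    """Rows: one per (characteristic, category). Columns: percent of that group with each flag set."""
    report = {}
    for var in row_vars:
        for category in sorted({s[var] for s in students}):
            group = [s for s in students if s[var] == category]
            cells = {"n": len(group)}
            for col in col_vars:
                cells[col] = round(100 * sum(s[col] for s in group) / len(group), 1)
            report[(var, category)] = cells
    return report


def print_report(report, col_vars):
    """Print the report as a plain-text table, one row per student group."""
    print(f"{'group':<22}{'n':>4}" + "".join(f"{c:>26}" for c in col_vars))
    for (var, category), cells in report.items():
        label = f"{var}={category}"
        print(f"{label:<22}{cells['n']:>4}" + "".join(f"{cells[c]:>25.1f}%" for c in col_vars))


COLUMNS = ["retained_term2", "key_course_c_or_better"]
report = matrix_report(COHORT, ["gender", "orientation"], COLUMNS)
print_report(report, COLUMNS)
```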

Once they are set up, basic reporting templates like these can be easily 
modified and replicated for different populations. More importantly, using 
the population selection capabilities of commercially available statistical 
packages, they can be generated automatically for any first-year popula- 
tion that can be defined in terms of combinations of data elements in the 
database. This disaggregation enables analyses that “drill down” into the 
first-year student population to examine exactly how particular types of 
students experience the curriculum and co-curriculum, and how the impact 
of these experiences on students may differ. For example, to investigate 
the impact of a particular intervention (such as participation in a student 
orientation program), an analyst could run a standard report showing academic 
performance twice — once for students who participated in the program 
and once for those who did not — and compare the results. Because the 
row variables in both cases are the same, the comparative impact of the 
program intervention can be examined for each type of student included. 
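
The "run the report twice" comparison might look like the following Python sketch. The field names (`skill_level`, `orientation`, `retained`) and the counts are invented for illustration, and the report function is a deliberately simplified stand-in for a standard report template.

```python
def _make(skill, participated, n, n_retained):
    """Build n made-up student records, n_retained of them retained."""
    return [
        {"skill_level": skill, "orientation": participated, "retained": i < n_retained}
        for i in range(n)
    ]


# Hypothetical cohort, for illustration only.
COHORT = (
    _make("developmental", True, 5, 4)
    + _make("developmental", False, 5, 2)
    + _make("college-ready", True, 5, 4)
    + _make("college-ready", False, 4, 3)
)


def standard_report(students, row_var, outcome):
    """Percent of students with the outcome, for each category of the row variable."""
    report = {}
    for category in sorted({s[row_var] for s in students}):
        group = [s for s in students if s[row_var] == category]
        report[category] = round(100 * sum(s[outcome] for s in group) / len(group), 1)
    return report


def compare_by_participation(students, flag, row_var, outcome):
    """Run the same report for participants and non-participants, then line up the rows."""
    took = standard_report([s for s in students if s[flag]], row_var, outcome)
    skipped = standard_report([s for s in students if not s[flag]], row_var, outcome)
    return {
        cat: (took[cat], skipped[cat], round(took[cat] - skipped[cat], 1))
        for cat in took
        if cat in skipped
    }


comparison = compare_by_participation(COHORT, "orientation", "skill_level", "retained")
for category, (took_pct, skipped_pct, gap) in comparison.items():
    print(f"{category:<15} participants {took_pct}%  others {skipped_pct}%  gap {gap}")
```

The gaps are descriptive only. A larger gap for one type of student does not by itself show that the program caused the difference.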

More sophisticated kinds of data analyses using the common core or 
longitudinal files can also be undertaken using multivariate statistical 
techniques like regression and cluster analysis. These techniques allow the 
independent effects of particular variables to be investigated after con- 
trolling for various other factors. This useful capacity can be used to sort 
out such questions as whether participation in a first-year program 
mattered or whether any changes in student success observed were really 
due to the characteristics of the students who participated. In summarizing 
the results of analyses for decisionmakers and program participants, 
though, it is usually wise to present data in tabular or graphic form. As a 
result, many analysts use multivariate statistical techniques to explore and 
make sense of the data they are examining, and then communicate what 
they find in relatively straightforward terms, forgoing the presentation of 
all of the statistical manipulation that went into key findings. Nevertheless, 
should those statistical calculations be of interest, analysts will have them 
available “in their back pockets” for sharing and consultation. 
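
The idea of "controlling for" other factors can be illustrated without a full regression. The Python sketch below uses simple stratification, a cruder stand-in for the multivariate techniques named above, on invented data in which stronger students are more likely to participate. The field names (`skill`, `participated`, `retained`) and all counts are hypothetical.

```python
def _make(skill, participated, n, n_retained):
    """Build n made-up student records, n_retained of them retained."""
    return [
        {"skill": skill, "participated": participated, "retained": i < n_retained}
        for i in range(n)
    ]


# Hypothetical cohort: high-skill students participate more often.
COHORT = (
    _make("high", True, 8, 6)
    + _make("high", False, 4, 3)
    + _make("low", True, 2, 1)
    + _make("low", False, 6, 3)
)


def rate(group, outcome):
    """Share of the group with the outcome."""
    return sum(s[outcome] for s in group) / len(group)


def raw_difference(students, flag, outcome):
    """Participant rate minus non-participant rate, ignoring all other factors."""
    took = [s for s in students if s[flag]]
    skipped = [s for s in students if not s[flag]]
    return rate(took, outcome) - rate(skipped, outcome)


def adjusted_difference(students, flag, outcome, control):
    """Average the within-stratum differences, weighting each stratum by its size."""
    total, weight = 0.0, 0
    for level in {s[control] for s in students}:
        stratum = [s for s in students if s[control] == level]
        took = [s for s in stratum if s[flag]]
        skipped = [s for s in stratum if not s[flag]]
        if not took or not skipped:
            continue  # no comparison is possible within this stratum
        total += len(stratum) * (rate(took, outcome) - rate(skipped, outcome))
        weight += len(stratum)
    return total / weight


raw = raw_difference(COHORT, "participated", "retained")
adjusted = adjusted_difference(COHORT, "participated", "retained", "skill")
print(f"raw gap: {raw:.2f}, gap after holding skill level constant: {adjusted:.2f}")
```

In this invented data the apparent advantage of participants disappears once entering skill level is held constant, which is the kind of selection effect the text warns about. Regression extends the same logic to several control variables at once.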
Embarking on a data audit designed to support and improve the first 
year of college is a significant step for any campus. Hopefully, the data 
audit will lead in the direction of a more comprehensive and intentional 
approach to collecting and analyzing information about the first year of 
college. In undertaking it, we want to reemphasize some of the points 
made at the outset of this Toolkit. 

First, always remember that “truth” lies in the variations. Real people 
with real differences make up the first-year population at any college, and 
the same is true of all our faculty and staff. So avoid being misled by 
averages and other “central tendency” results that are meant to apply to 
all students and situations. Instead, disaggregate the data as far as you can 
to uncover the many differences in experience and situation that probably 
exist. 

Second, results of assessments and evaluations are almost always more 
useful in generating further questions and in stimulating reflective 
faculty/staff conversations than in “making judgments” about program 
performance. It will always be important to use available data to create 
occasions for further reflection and conversation about collective action, 
rather than employing data to point fingers and blame units or individuals 
for shortfalls in performance. Indeed, the metaphor of scholarship is 
usually effective in such situations: the object of evaluation is nothing 
more than to turn the tools and habits of systematic investigation that we 
were all trained to practice in our disciplines onto our own core enterprise 
of facilitating student success. Like scholarship in any field, the process 
of gathering and analyzing data about the first year of college should be 
open, deliberative, systematic, and ongoing — never really completed. 

Third, consistent with the view that engaging in assessment and evalu- 
ation is a profoundly educative act, students should be involved in the process 
as fully as possible. The best data systems are designed not only to 
provide evidence to decisionmakers but also to enable feedback and inter- 
vention in individual cases. Indeed, the data audit process may uncover 
numerous opportunities to communicate information back to students about 
their own strengths and weaknesses, or to introduce such information 
into the advisement relationship. At the program level, moreover, 
student participation in the process of interpreting evaluation results is 
often especially valuable. For example, focus groups of students are 
frequently useful in helping to interpret observed patterns of student 
behavior or to provide in-depth commentary on survey results. 

Fourth and finally, the mindset required for sustaining such projects in 
the long term is one of continuous improvement. Those engaged in 
assessing and evaluating first-year-of-college programs should always 
bear in mind that no matter how good things are (or you think they are), 
they can always be improved. Finding the ways in which this can be 
accomplished is about details, not about “silver bullet” solutions that try 
to change everything at once. Real improvements take place by identifying 
and addressing individual classes of problems that occur for particular 
types of students throughout the institution. The mindset that such 
improvement is a collective responsibility in pursuit of a common goal — 
student success in the first year of college — is critical to this process, as 
is a common store of usable information. Hopefully, this Toolkit will be 
of help in creating or strengthening this resource. 
GLOSSARY 



Anonymity (provision for): “Evaluator action to ensure that the identity 
of subjects cannot be ascertained during the course of the study, in study 
reports, or in any other way (Joint Committee on Standards for 
Educational Evaluation, 1994).” “Only when the sponsor cannot identify 
each person’s response, even momentarily, is it appropriate to promise 
that a response is anonymous (Dillman, 2000, p. 163).” 

Confidentiality: “Answers are confidential. This statement conveys an 
ethical commitment not to release results in a way that any individual’s 
responses can be identified as their own (Dillman, 2000, p. 163).” 

Data: “Material gathered during the course of an evaluation that serves 
as the basis for information, discussion, and inference (Joint Committee 
on Standards for Educational Evaluation, 1994).” 

Data Audit: The process of identifying data resources and uses wherever 
they may be within an institution and gathering them into a useable 
information system. 

Data Element: Single, individual piece of data such as “name” or 
“race.” 

Face Validity: “The extent to which an instrument looks as if it 
measures what it is intended to measure (Nunnally, 1970).” “An instrument has 
face validity if decisionmakers and information users can look at the 
items and understand what is being measured (Patton, 1984).” “It is 
obvious, on the face of it, that the proposed procedure is the best way of 
measuring the phenomenon of interest (Rutman, 1984).” “Apparent 
validity, typically of test items or of tests; there can be skilled and 
unskilled judgments of face validity. Highly skilled judgments come pretty 
close to content validity, which does require systematic substantiation 
(Scriven, 1991).” 

Footprint Data: Data that is gathered from a student or faculty member 
in the normal course of interacting with a postsecondary institution — e.g., 
data gathered on an admissions form, or on a form to have access to 
library resources. 
Goal: “A statement, usually general and abstract, of a desired state 
toward which a program is directed (Rossi and Freeman, 1993).” “An end 
that one strives to achieve (Joint Committee on Standards for Educational 
Evaluation, 1994).” 

Guerrilla Database: An unofficial database not normally known to the 
larger institution — e.g., database of student teacher experiences and mentors 
for Education students. 

Information: “Numerical and nonnumerical findings, renderings, or 
presentations — including facts, narratives, graphs, pictures, maps, displays, 
statistics, and oral reports — that help illuminate issues, answer questions, 
and increase knowledge and understanding of a program or other object 
(Joint Committee on Standards for Educational Evaluation, 1994).” 

Needs Assessment: “Systematic appraisal of the type, depth, and scope 
of a problem (Rossi and Freeman, 1993).” “. . . is a process for discovering 
facts about the functions or dysfunctions of organisms or systems; it’s not 
an opinion survey or a wishing trip (Scriven, 1991).” 

Objectives: “Specific, operationalized statements detailing the desired 
accomplishments of a program (Rossi and Freeman, 1993).” “Something 
aimed at or striven for, more specific than a goal (Joint Committee on 
Standards for Educational Evaluation, 1994).” 

Official Data: Data reported to federal or state agencies that must be 
exactly replicable. 

Policy Significance: “The significance of an evaluation’s findings for 
policy and program development (as opposed to their statistical signifi- 
cance) (Rossi and Freeman, 1993).” 

Sensitivity Analysis: The systematic analysis of the influence of various 
input values on the output of a model. 

Snapshots: To freeze data from a transactional database by capturing 
it at one particular time. 

Stakeholders: “Individuals or groups who may affect or be affected by 
program evaluation (Joint Committee on Standards for Educational 
Evaluation, 1994).” 

Transactional Database: A live database used to conduct interactions 
between humans and electronic databases, e.g., a registration system. 

Triangulation: “The use of multiple sources and methods to gather 
similar information (Joint Committee on Standards for Educational 
Evaluation, 1994).” 

Unit of Analysis: “The least divisible element on which measures are 
taken and analyzed (Joint Committee on Standards for Educational 
Evaluation, 1994).” 

Unofficial Data: Data that may not necessarily be replicable. 

Utility: “The extent to which an evaluation produces and disseminates 
reports that inform relevant audiences and have beneficial impact on their 
work (Joint Committee on Standards for Educational Evaluation, 1994).” 
FIGURE 1 

EXPECTATION EXERCISE 

From Regional State University 



RESULTS FROM THE ACADEMIC LEADERSHIP RETREAT 2001 

National Survey of Student Engagement Question: 

In your experience at your institution during the current school year, about how 
often have you done each of the following? 

                                              Freshmen                    Senior 
                                       Predicted  Ideal  Actual   Predicted  Ideal  Actual 

a. Asked question in class or 
   contributed to class discussions      1.96     3.36    2.69      2.81     3.72    3.32 

b. Made a class presentation             1.62     2.68    2.20      2.77     3.46    2.93 

c. Prepared two or more drafts of a 
   paper or assignment before 
   turning it in                         1.53     3.24    2.94      2.27     3.42    2.61 

d. Worked on a paper or project that 
   required integrating ideas or 
   information from various sources      1.95     3.28    3.22      2.74     3.61    3.32 

“Predicted” were predicted by a faculty group prior to seeing actual results. 
“Ideal” were projected by a faculty group prior to seeing actual results. 
“Actual” are actual student results from that institution for 2001. 



FIGURE 2 

Student Services for Online Learners Beyond the Administrative Core 

The purpose of using this “web” in the Data Audit and Analysis Toolkit is to illustrate the 
variety, breadth of and interactions among student services on a typical college campus. 

This figure is used by permission from the Western Cooperative for Educational 
Telecommunications Learning Anytime Anyplace Partnership project. The goal of that project is 
to design student services beyond the administrative core. To reach a common understanding 
about what was meant by student services for purposes of the project, the partners divided 
services needed by online learners into five clusters or suites: administrative core services, 
academic services, communications services, personal services, and student communities 
services. 
FIGURE 3 

Student Affairs Offices and the Types of Data They Might Keep 



Academic Services 

Academic Advising 
    Academic Records/Grades 
    Academic Support Office Use by Students 
    Admissions 
    Assessment Data 
    Course Information 
    Disability Information 
    Documents 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Scholarship/Grants 
    Service Learning 
    Special Studies and Reports 
    Student Information System 
    Student Life Data 
    Surveys 

Academic Counseling 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Course Information 
    Disability Information 
    Documents 
    Field Placement 
    Housing 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Scholarship/Grants 
    Service Learning 
    Special Studies and Reports 
    Student Information System 
    Student Life Data 
    Surveys 

Assessment and Testing 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Course Information 
    Documents 
    Field Placement 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Surveys 

Bookstore 
    Course Information 
    Documents 
    Faculty/Personnel Information 
    Course Syllabi and Textbook Use 

Developmental Education Services 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Athletics 
    Course Information 
    Disability Information 
    Documents 
    Field Placement 
    Housing 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Surveys 

Disability Services 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Athletics 
    Course Information 
    Disability Information 
    Documents 
    Field Placement 
    Housing 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Surveys 

Library 
    Course Syllabi and Textbook Use 
    Documents 
    Library Use 
    Special Studies and Reports 
    Surveys 

Retention Services 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Athletics 
    Course Information 
    Disability Information 
    Documents 
    Field Placement 
    Institutional Research 
    Internships/Cooperative Education 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Scholarship/Grants 
    Service Learning 
    Special Studies and Reports 
    Student Information System 
    Student Life Data 
    Surveys 

Technical Support 
    Academic Support Office Use and Data 
    Course Information 
    Disability Information 
    Documents 
    Institutional Research 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Student Life Data 
    Surveys 

Tutoring 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Athletics 
    Course Information 
    Disability Information 
    Documents 
    Institutional Research 
    Learning Center Use 
    Placement Data 
    Prerequisite Information 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Student Life Data 
    Surveys 

Administrative Core 

Admissions, Registration 
    Academic Records/Grades 
    Academic Support Office Use and Data 
    Admissions 
    Assessment Data 
    Athletics 
    Course Information 
    Disability Information 
    Documents 
    Institutional Research 
    Placement Data 
    Registration 
    Scholarship/Grants 
    Student Information System 
    Student Life Data 
    Surveys 

Course/Program Catalog 
    Academic Records/Grades 
    Accreditation 
    Admissions 
    Course Evaluations 
    Course Information 
    Documents 
    Facilities, particularly Classroom, Computer, and Laboratory Setup 
    Faculty/Personnel Information 
    Institutional Research 
    Prerequisite Information 
    Registration 
    Special Studies and Reports 
    Student Information System 
    Surveys 

Financial Aid 
    Business Affairs 
    Financial Aid Information 
    Institutional Research 
    Scholarships/Grants 
    Surveys 

Schedule of Classes 
    Course Information 
    Documents 
    Facilities, particularly Classroom, Computer, and Laboratory Setup 
    Faculty/Personnel Information 
    Prerequisite Information 
FIGURE 4 

Academic Affairs Units and the Types of Data They Might Keep 

FIGURE 5 

Conceptual Model of Student Flow Process 

FIGURE 6 

Example: Small Rural Community College 
Breakdown of Fall Enrollment by Types of Students 

FIGURE 7 

General Layout of a Longitudinal Student Database File 
APPENDIX A 

An Example of Common Core Files with Data Elements 

The data files that make up a recommended common core for examining the first 
year of college include: 



Common Core Student Data Elements — Demographic 
Common Core Student Data Elements — Educational Background 
Common Core Student Data Elements — Enrollment Status 
Common Core Course/Section Data Elements 
Common Core Enrollment Data Elements [Course/Person] 
Common Core Admissions Data Elements 
Common Core Personnel Data Elements — Instructional/Non-Instructional Staff 
Common Core Finance Data Elements 
Common Core Physical Facilities Data Elements 



Individual data elements under the headings are listed on the following pages. Elements that 
may be useful but that are of secondary importance for most analyses are enclosed in square 
brackets. Recommended data elements for the common core are drawn from many places 
throughout the institution. The most common sources are listed under “Source” using the 
following acronyms: 



SIS = Student Information Management System (Registrar’s System) 

PPS = Payroll/Personnel System 

FIN = Financial System 

FINAID = Financial Aid System 

PHY = Physical Facilities System 



Type of data can be either: 

A=alpha 

N=numeral 
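
One way to carry these element definitions into code is sketched below in Python. The three elements and their Source, Length, and Type values are taken from the admissions tables in this appendix; the class name `DataElement` and its `is_valid` check are hypothetical. The sketch assumes that Length is a maximum field width and that alpha fields may hold any characters, neither of which the appendix states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One row of a common core table: element name, source system, length, and type."""

    name: str
    source: str  # e.g. "SIS", "PPS", "FIN", "FINAID", "PHY"
    length: int  # assumed here to be the maximum field width
    type: str  # "A" = alpha, "N" = numeral
    comment: str = ""

    def is_valid(self, value: str) -> bool:
        """Check a raw value against the declared length and type."""
        if len(value) > self.length:
            return False
        if self.type == "N":
            return value.isascii() and value.isdigit()
        return True  # assumption: alpha fields accept any characters


STUDENT_ID = DataElement("Student ID Number", "SIS", 9, "A", "Key link")
ADMIT_STATUS = DataElement("Admit Status", "SIS", 1, "A")
APPLICATION_DATE = DataElement("Application Date", "SIS", 8, "N")

print(APPLICATION_DATE.is_valid("20010815"))
print(APPLICATION_DATE.is_valid("2001-08-15"))
```

Checks like these are most useful when extracts from several source systems are first merged, since mismatched lengths or types are a common reason that records fail to link.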
[Place of Birth - Mother] SIS 

[Place of Birth - Father] SIS 
Common Core Student Data Elements — Educational Background 

Element                                   Source  Length  Type  Comments 

ETS Code of High School (or equivalent)   SIS     4       N 
Date of High School Graduation            SIS     6       N 
MAT Most Recent Total Score               SIS     2       N 
WAT Most Recent Total Score               SIS 
WAT Most Recent 1st Reader’s ESL          SIS 
WAT Most Recent 2nd Reader’s ESL          SIS     1       A 
WAT Most Recent 3rd Reader’s ESL          SIS     1       A 
WAT Most Recent Final ESL Indicator       SIS     1       A 






[Appeal Reason] SIS 

[Appeal Granted Indicator] SIS 



Common Core Student Data Elements — Enrollment Status 

Element                                   Source        Length  Type  Comments 

Common Core Course/Section Data Elements 

Element                                   Source        Length  Type  Comments 

# Credits Enrolled Same Dept.             [calculated] 
# Credits Enrolled Same College/School    [calculated] 

Common Core Enrollment Data Elements [Course/Person] 

Element                                   Source        Length  Type  Comments 

Student ID Number                         SIS           9       A     Key link 
Incomplete Flag                           [calculated]  1       A     Was or is incomplete 
Date Completed                            [calculated]  1       A 
Withdrawal Flag                           [calculated]  1       A     Any withdrawal 
No Credit Flag                            SIS           1       A     Not taken for credit 
Course Cancelled Flag                     SIS           1       A 

Common Core Admissions Data Elements 

Element                                   Source        Length  Type  Comments 

Student ID Number                         SIS           9       A     Key link 
[Full Name]                               SIS           39      A 
[Last Name]                               SIS           20      A 
Admit Status                              SIS           1       A 
Admit Type                                SIS           1       A 
Application Date                          SIS           8       N 

Common Core Personnel Data Elements — Instructional/Non-Instructional Staff 

Element                                   Source        Length  Type  Comments 

Appointment Department Code               PPS           4       N     Recode to match 
Appointment Department Name               PPS           30      A     Recode to match 

Common Core Finance Data Elements 

Element                                   Source        Length  Type  Comments 

Account Year                              FIN           4       A     Recode to match 
APPENDIX B 

Report Forms and Definitions 



Reports produced by a longitudinal tracking system are intended to provide timely and accessible 
information to decisionmakers at the institution. Standard summary reports are provided to 
document overall patterns of student persistence, performance, and behavior. All standard 
reports are constructed for a given tracking cohort and can be run to ascertain the status and 
performance of members of that cohort as of any designated subsequent term in the tracking 
period. All reports can be run for the cohort as a whole or for any subset of the cohort as defined 
by an available tracking system data element. For example, the entire report package could be 
run only for students in a certain ethnic category or program of study. The system should also be 
designed so that an investigator can probe the dataset at any time using additional statistical 
procedures through locally available statistical packages (SPSS, SAS, or Access). 

Standard reports are of several different types, described as follows: 

• Progression and Status, or Overall Student Flow Reports document or summarize the 
number of students and the percentage of a given cohort still enrolled as of a given 
subsequent term, the number completing degree programs, and other longitudinal 
progression information. 

• Overall Academic Performance Reports document the achievements to date of members 
of the cohort in terms of hours attempted and completed, grade point averages, course 
completion rates, average loads, and similar indicators. 

• Coursework Status Reports including Patterns of Experience document the performance 
and progress of students in particular, identified courses in core skill areas. 

• Coursework Placement/Effectiveness Reports document the effectiveness of basic skills 
placement policies by examining student performance in later coursework for students 
initially placed at various skill levels and completing various later courses. 

Report formats and associated descriptions and definitions for each standard report are provided 
below: 

TERM-TO-TERM PROGRESSION REPORT 

This report provides summary term-to-term information on student progress. It presents both the 
absolute number of students and the percentage of the beginning cohort persisting and 
completing for each elapsed term in the tracking period. Separate versions of the report can be 
produced for a) first-time college students, and b) new transfer students. Column headers consist 
of a longitudinal series of terms for which these summary performance indicators can be 
calculated. Row variables consist of a standard set of demographic and educational background 
groupings. A “Program-Level” version of this report can also be run with row variables 
corresponding to initial program codes, grouped by catalog length of program (e.g., One-Year 
Certificate, Two-Year AAS, and Transfer Programs). 



All percentages in this report are calculated on the basis of their associated row totals. Note that 
all completion percentages in this report are cumulative; that is, any entry includes all those 
students who had completed a degree by the end of the indicated term and all previous terms 
included in the tracking period. 
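
The counting rules for this report can be sketched in Python as follows. The record layout (`credits_attempted` keyed by term number, `completed_term` for the term of the first degree or certificate) and the six sample students are hypothetical stand-ins for the Total Credit Hours Attempted and Type of Degree or Certificate Awarded data elements.

```python
# Hypothetical cohort; term 1 is each student's first term of academic history.
COHORT = [
    {"gender": "F", "credits_attempted": {1: 12, 2: 12, 3: 12, 4: 12}, "completed_term": None},
    {"gender": "F", "credits_attempted": {1: 12, 2: 9}, "completed_term": None},
    {"gender": "F", "credits_attempted": {1: 15, 2: 15, 3: 15}, "completed_term": 3},
    {"gender": "F", "credits_attempted": {1: 6}, "completed_term": None},
    {"gender": "M", "credits_attempted": {1: 12, 2: 12, 3: 6, 4: 6}, "completed_term": None},
    {"gender": "M", "credits_attempted": {1: 12, 2: 12}, "completed_term": 2},
]


def progression_report(students, row_var, n_terms):
    """For each row group and each term 2..n_terms: enrolled and cumulative completers."""
    report = {}
    for category in sorted({s[row_var] for s in students}):
        group = [s for s in students if s[row_var] == category]
        total = len(group)  # row total; also the Term 1 enrollment
        terms = {}
        for term in range(2, n_terms + 1):
            enrolled = sum(1 for s in group if s["credits_attempted"].get(term, 0) > 0)
            # Cumulative: anyone who completed in this term or any earlier term.
            completed = sum(
                1 for s in group
                if s["completed_term"] is not None and s["completed_term"] <= term
            )
            terms[term] = {
                "enrolled": enrolled,
                "enrolled_pct": round(100 * enrolled / total, 1),
                "completed": completed,
                "completed_pct": round(100 * completed / total, 1),
            }
        report[category] = {"n": total, "terms": terms}
    return report


report = progression_report(COHORT, "gender", n_terms=4)
for category, row in report.items():
    print(category, row["n"], row["terms"])
```

Percentages are computed on the row total, and the completion figures are cumulative, as the report description requires.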

Variables used in this report (page 105) can be found in tables in Appendix A and are further 
defined as follows. 

Column Variables : 

NUMBER OF STUDENTS: The total number of students in the cohort who are members of the 
demographic groups described by the row labels. These also correspond to Term 1 enrollments. 

TERM 2/TERM N: Includes (1) the number of students in the cohort who are actively enrolled 
in the institution during each elapsed term as indicated by the Total Credit Hours Attempted data 
element, and (2) the cumulative number who have completed a degree or certificate as of the 
term indicated by the first Type of Degree or Certificate Awarded data element. “Term 4” thus 
includes an entry for all students actively enrolled as of the fourth term after the cohort’s first 
term of academic history, and an entry for those who had completed degrees or certificates up to 
and including the fourth term. This report is produced in a “count” version giving absolute 
numbers and a “percentage” version giving the proportion of cohort starters in each group 
persisting and completing. Only the first four terms of the total term tracking period are 
illustrated in the example. 
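
The count and percentage versions of the column entries defined above can be sketched as follows. This is only an illustration under stated assumptions, not the Toolkit's own implementation: the record layout (`credits_attempted`, `degree_term`) and the function name `term_progression` are hypothetical stand-ins for the Total Credit Hours Attempted and first Type of Degree or Certificate Awarded data elements.

```python
def term_progression(students, n_terms):
    """Return per-term persistence and cumulative completion for a cohort.

    Each student is a dict with hypothetical keys:
      'credits_attempted': credit hours attempted per term (index 0 = Term 1)
      'degree_term': term number of the first degree/certificate, or None
    """
    cohort_size = len(students)
    rows = []
    for term in range(1, n_terms + 1):
        # Actively enrolled: more than zero credit hours attempted this term.
        enrolled = sum(1 for s in students if s["credits_attempted"][term - 1] > 0)
        # Cumulative completers: award in this term or any earlier term.
        completed = sum(
            1 for s in students
            if s["degree_term"] is not None and s["degree_term"] <= term
        )
        rows.append({
            "term": term,
            "enrolled": enrolled,
            "completed": completed,
            # Percentages are taken against the row total (the starting cohort).
            "pct_enrolled": 100.0 * enrolled / cohort_size if cohort_size else 0.0,
            "pct_completed": 100.0 * completed / cohort_size if cohort_size else 0.0,
        })
    return rows


# Made-up four-student cohort tracked over four terms.
cohort = [
    {"credits_attempted": [12, 12, 9, 0], "degree_term": None},
    {"credits_attempted": [15, 15, 0, 0], "degree_term": 2},
    {"credits_attempted": [6, 0, 0, 0], "degree_term": None},
    {"credits_attempted": [12, 9, 12, 12], "degree_term": 4},
]
report = term_progression(cohort, 4)
```

Running the same function on a subset of the cohort (for example, one gender or age category) would yield one row of the report, with that subset's size as the row total.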

Row Variables : 

GENDER: Male and Female categories as indicated by the Gender data element. 

AGE: Age categories as indicated. Age is derived from the YY (year) digits of the Date of Birth 
data element and is calculated as of the beginning of the tracking period. 

RACE AND ETHNICITY: Race and Ethnic categories as indicated; recoded from the Race and 
Ethnic data elements. 

CITIZENSHIP: Categories as indicated by the Citizenship data element. 

RESIDENCE AT ENTRY: Categories of in-district and out-of-district. 

PHYSICALLY DISABLED: Row includes only those students indicating a handicap as 
contained in one or more Impairment flags. 

DISADVANTAGE: Row contains only those students with disadvantaged status as defined and 
calculated by institution. 

LIMITED ENGLISH: Row contains only those students indicating limited English-speaking 
ability as defined and calculated by institution. 

HIGHEST DEGREE ATTAINED: Categories as shown in the table (page 106); recoded from 
the Highest Degree Previously Attained data element. 

ENTERING STATUS: Categories as shown in the table (page 106); assigned and recoded on 
the basis of an entry in one or more of the Prior College data elements. 

DAY/EVENING INDICATOR: Categories as calculated based on institutional definitions for 
the first term of the tracking period. 

HIGH SCHOOL CONCURRENT FLAG: Row includes only those students that began their 
study at the college as concurrent enrollment students as shown in the High School Concurrent 
Flag data element. 

DATE OF HIGH SCHOOL GRADUATION: Categories as indicated, calculated on the basis of 
the beginning of the tracking period. 

INITIAL LOCATION OF ATTENDANCE: Categories as derived from Course Section location 
data elements for the first term of the tracking period. 

INITIAL PROGRAM: Categories as shown; recoded from the Registered College Program data 
element for the first term of the tracking period. 

INITIAL OBJECTIVE: Categories as contained in the Degree Objective data element for the 
first term of the tracking period. 

INTENDED PERSISTENCE: Categories as contained in the Intended Persistence data element 
for the first term of the tracking period. 

INITIAL READING PROFICIENCY: Categories as indicated in Educational Background 
database data elements for the first term of the tracking period. 

INITIAL WRITING PROFICIENCY: Categories as indicated in Educational Background 
database data elements for the first term of the tracking period. 

INITIAL MATH PROFICIENCY: Categories as indicated in Educational Background database 
data elements for the first term of the tracking period. 

PROGRAM-LEVEL STATISTICS: Categories as shown, grouped into categories 
corresponding to the catalog length of the program; these are initial program declarations as 
indicated by the Registered College Program data element for the first term of the tracking 
period. 

SUMMARY PROGRESS REPORTS 

These reports document in greater detail the extent to which particular student populations 
persist and complete degrees. They are “snapshot” reports, reflecting the status of a given cohort 
as of a given term in the tracking period, and can be run for any term. Their format presents a 
number of persistence-related indicators as column variables and a range of subpopulation 
descriptors as row variables. Each column is intended to provide a somewhat different indicator 
of cohort status and is defined independently; note that the categories represented are not 
necessarily mutually exclusive and consequently may sum to more than 100% of the cohort. The 
report is produced in two forms: the “count” version shows the absolute number of students in 
each category, and the “percentage” version shows the corresponding percentage of each row 
total. In parallel with the Term-to-Term Progression Report, a Program-Level version of the 
report is also produced. 

Column and row variables for these reports are defined below. 

Column Variables : 

NUMBER OF STUDENTS: The number of students in the cohort who are members of the 
subpopulations described by the row variables. For the percentage report, this number is 
repeated to indicate cell size, against which the significance of the reported percentages 
can be judged. 

ENROLLED: The number of students in the cohort who are officially enrolled during the term 
for which the report is run. A student is counted as “enrolled” if a greater-than-zero entry is 
present for the Cumulative Credits Attempted data element for the term for which the report is 
run. 



NOT ENROLLED: The number of students who are not enrolled by the above definition for the 
term for which the report is run. 

SUSPENDED/DISMISSED: The number of students who are noted as academically dismissed 
or continuing dismissed in the Current Academic Status data element for any term up to and 
including the term for which the report is run. 

NOT PERSISTING: The number of students who have not officially enrolled according to the 
above definition for two consecutive prior terms (excluding summer terms) and have not 
graduated. Note that the classification of a student as “not persisting” in this report is provisional 
and may change on the basis of subsequent behavior in later terms. 

FIRST TERM ONLY: The number of students who officially enrolled in their first term (the 
cohort’s first term of academic history), but who have not enrolled according to the above 
definition, and who have not graduated, in any subsequent term up to and including the term for 
which the report is run. Note that if the report is generated for the first term of the cohort, these 
entries should correspond to the “Number of Students” column. 

COMPLETERS: The number of students who have earned a degree or certificate as indicated by 
any one of the Type of Degree/Certificate Awarded data elements for any term up to and 
including the term for which the report is run. Note that the same student may be present in both 
this category and in the “Enrolled” category if the student has re-enrolled after completing a 
degree or certificate. 

RE-ENROLLED AFTER COMPLETION: The number of students who have earned a degree or 
certificate as defined in the “Completers” column and are also currently enrolled according to the 
above definitions. 
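
The column definitions above can be read as a set of overlapping flags computed per student and then counted. The sketch below illustrates that reading; it is not the Toolkit's implementation. The record layout (`credits`, `summer`, `dismissed_terms`, `degree_term`) is hypothetical, and one interpretation is assumed: “two consecutive prior terms” is taken to mean the two most recent non-summer terms up to and including the report term.

```python
def progress_flags(student, report_term):
    """Classify one student as of report_term (1-based).

    Hypothetical record layout:
      'credits': credit hours attempted per term (index 0 = Term 1)
      'summer': parallel list of booleans marking summer terms
      'dismissed_terms': set of term numbers with a dismissed status
      'degree_term': term of the first degree/certificate, or None
    """
    credits = student["credits"][:report_term]
    summer = student["summer"][:report_term]
    enrolled = credits[-1] > 0
    completed = (student["degree_term"] is not None
                 and student["degree_term"] <= report_term)
    # Enrollment history restricted to non-summer terms, oldest first.
    regular = [c > 0 for c, is_summer in zip(credits, summer) if not is_summer]
    # Assumption: the two most recent non-summer terms decide "not persisting".
    not_persisting = (len(regular) >= 2 and not any(regular[-2:])
                      and not completed)
    first_term_only = (credits[0] > 0
                       and not any(c > 0 for c in credits[1:])
                       and not completed)
    return {
        "enrolled": enrolled,
        "not_enrolled": not enrolled,
        "suspended_dismissed": any(t <= report_term
                                   for t in student["dismissed_terms"]),
        "not_persisting": not_persisting,
        "first_term_only": first_term_only,
        "completers": completed,
        "re_enrolled_after_completion": completed and enrolled,
    }


def summary_progress(students, report_term):
    """Count students per column; columns overlap, so they need not sum to N."""
    counts = {"number_of_students": len(students)}
    for s in students:
        for column, flag in progress_flags(s, report_term).items():
            counts[column] = counts.get(column, 0) + int(flag)
    return counts


# Made-up cohort; the third term is a summer term.
SUMMER = [False, False, True, False]
cohort = [
    {"credits": [12, 12, 0, 0], "summer": SUMMER,
     "dismissed_terms": set(), "degree_term": None},
    {"credits": [6, 0, 0, 0], "summer": SUMMER,
     "dismissed_terms": {2}, "degree_term": None},
    {"credits": [15, 15, 0, 9], "summer": SUMMER,
     "dismissed_terms": set(), "degree_term": 2},
]
counts = summary_progress(cohort, 4)
```

In this example the third student appears under both “Enrolled” and “Completers”, which is why the columns can sum to more than the cohort size. Run for the first term, every student is counted as “First Term Only”, matching the note in that column's definition.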





Row Variables: 



The row variables, definitions, and labels used in this report are identical to those used in the 
Term-to-Term Progression Report. 

SUMMARY PERFORMANCE REPORT 

This report presents summary statistics that describe the enrollment behavior of particular 
student subpopulations as they progress. The layout of the report is similar to that of the 
Summary Progress Report described previously. Like that report, it is a “snapshot” reflecting the 
status of the cohort as of a particular designated term in the tracking period. Performance 
indicators are arrayed as column headers and subpopulation breakdowns are incorporated as row 
variables. A “Program-Level” version of the report can also be created. Variables included in 
this report are defined as follows. 

Column Variables : 

NUMBER OF STUDENTS: The number of students in the cohort who are members of the 
demographic groups described by the row labels. These totals are the same as used in previous 
reports. 

TOTAL CREDITS ATTEMPTED: The total number of student credit hours attempted up to and 
including the term for which the report is run. It is based on the total of the Cumulative Credits 
Attempted data elements across all past terms. The statistic presented is a group average for 
each designated population. 

CREDITS EARNED: The total number of credits earned up to and including the term for which 
the report is run. It is based on the total of the Cumulative Credits Earned data elements across all 
past terms. As above, this statistic is presented as a group average for the designated population. 

AVERAGE LOAD (EXCLUDING SUMMER): The average number of student credit hours 
attempted as defined above for each term in which the student was officially enrolled, up to and 
including the term for which the report is run, but excluding any summer terms. Each student’s 
average load is first calculated across all terms (excluding summers) in which the student was 
enrolled; this statistic is then averaged across all members of the designated population. 

CREDITS EARNED RATIO: The total number of student credit hours successfully completed 
by each student up to and including the term for which the report is run, divided by the total 
number of student credit hours attempted over the same period, both as defined previously. The 
ratio is 1.00 for a student who has successfully completed all courses. The completion ratio is 
calculated first for each student based on actual enrollments and completions. Then an average 
is prepared for each designated subpopulation. 

CUMULATIVE GPA: The cumulative official overall grade point average as of the term for 
which the report was run, as indicated by the Cumulative Grade Point Average data element. 
The statistic presented is a group average for the designated population. 

PERCENT EARNING DEGREE/CERTIFICATE: The percentage of students in each 
designated subpopulation who have successfully completed a degree or certificate as indicated 
by any entry in one of the four Type of Degree/Certificate Awarded data elements, for all terms 
up to and including the term for which the report is run. This entry corresponds to the 
“Completers” column in the Summary Progress Report. 

NUMBER OF ENROLLED TERMS TO COMPLETE: The number of terms in which students 
who completed a degree or certificate were officially enrolled up to and including the term in 
which a degree or certificate was awarded. The statistic presented is a group average for the 
designated subpopulation and includes only those who have completed a degree or certificate. 

NUMBER OF ELAPSED TERMS TO COMPLETE: The number of terms elapsed since the 
beginning of the tracking period up to and including the term in which a degree or certificate was 
awarded. The statistic presented is a group average as above, and includes only those who have 
completed a degree or certificate. 
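
Several of the statistics above are calculated per student first and only then averaged across the group. The sketch below illustrates that order of operations for average load, credits earned ratio, and the two terms-to-complete measures. It is a hedged illustration: the per-term record layout (`attempted`, `earned`, `summer`, `degree_term`) and all function names are assumptions made for the example, not the Toolkit's data elements.

```python
def mean(values):
    """Group average over entries that have a value (None is skipped)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def average_load(student):
    """Mean credits attempted across enrolled, non-summer terms."""
    loads = [c for c, is_summer in zip(student["attempted"], student["summer"])
             if c > 0 and not is_summer]
    return mean(loads)


def credits_earned_ratio(student):
    """Credits earned divided by credits attempted; 1.00 if all were completed."""
    attempted = sum(student["attempted"])
    return sum(student["earned"]) / attempted if attempted else None


def terms_to_complete(student):
    """Return (enrolled terms, elapsed terms) up to the award term, or None."""
    award = student["degree_term"]
    if award is None:
        return None
    enrolled = sum(1 for c in student["attempted"][:award] if c > 0)
    return enrolled, award


def summary_performance(students):
    """Average the per-student statistics for one designated population."""
    completers = [t for t in map(terms_to_complete, students) if t is not None]
    return {
        "average_load": mean([average_load(s) for s in students]),
        "credits_earned_ratio": mean([credits_earned_ratio(s) for s in students]),
        # Only completers contribute to the two terms-to-complete columns.
        "enrolled_terms_to_complete": mean([t[0] for t in completers]),
        "elapsed_terms_to_complete": mean([t[1] for t in completers]),
    }


# Made-up cohort; the second term is a summer term.
SUMMER = [False, True, False, False]
cohort = [
    {"attempted": [12, 6, 12, 0], "earned": [12, 6, 12, 0],
     "summer": SUMMER, "degree_term": 3},
    {"attempted": [6, 0, 10, 0], "earned": [3, 0, 5, 0],
     "summer": SUMMER, "degree_term": None},
    {"attempted": [10, 0, 0, 10], "earned": [10, 0, 0, 5],
     "summer": SUMMER, "degree_term": 4},
]
result = summary_performance(cohort)
```

Averaging per-student ratios gives every student equal weight regardless of how many credits each attempted, so the result generally differs from dividing the group's total credits earned by its total credits attempted.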

Row Variables : 

The row variables, definitions, and labels employed in this report are identical to those used in 
the previous two reports. 

COURSEWORK STATUS REPORTS 

The purpose of these reports is to track the performance of students of various kinds with respect 
to enrollment and performance in a range of developmental and common core sequence courses 
(English and Math). In the examples included here, the English/Speech Coursework Status 
Report tracks progress in English 001, English 002, English 003, and Speech 101. The Reading 
Coursework Status Report tracks progress in Reading 001, Reading 002, and Reading 003. The 
Initial Math Coursework Status Report tracks progress in Math 001, Math 002, and Math 003. 
The Later Math Coursework Status Report tracks progress in Math 005, Math 006, Math 007, 
and Math 008. The Business Coursework Status Report tracks progress in Business 100, 
Business 101, Accounting 101, and Secretarial Science 101. The Social Sciences Coursework 
Status Report tracks progress in Psychology 100 and Sociology 100. The Sciences Coursework 
Status Report tracks progress in Biology 100, Chemistry 100, and Chemistry 101. Like the 
Summary Progress and Summary Performance Reports described previously, these are 
“snapshot” reports, reflecting the status of the cohort with respect to these courses as of any 
designated term in the tracking period. Run successively for each term, they can be used to track 
the sequence and timing of taking these courses for different student populations. 

Formats for all seven reports are similar and their column and row variables are defined below. 

Column Variables : 

NUMBER OF STUDENTS: The number of students in the cohort who are members of the 
demographic groups described by the row labels. These totals are the same as used in previous 
reports. 

Course-related column variables for all seven reports are driven by “course specific” term 
data elements associated with each course. For each course, the following statistics are provided: 






PERCENT ENROLLING: The cumulative percentage of the starting cohort (or of each 
designated subpopulation) attempting the course in any term up to and including the term for 
which the report is run. A student is counted as “attempting” the course if any grade designation 
appears in the appropriate “Course Performance” data element in the current term or in any 
previous term in the tracking period. 

PERCENT RETAKING: The cumulative percentage of the starting cohort (or of each 
designated subpopulation) who enrolled for the course more than once up to and including the 
term for which the report is run. A student is counted in this category if more than one grade 
entry is detected in the appropriate “Course Performance” data elements in the current term or in 
any previous term in the tracking period. 

PERCENT COMPLETING: The cumulative percentage of the starting cohort (or of each 
designated subpopulation) who received credit for the course in any term up to and including the 
term for which the report is run. A student is counted as having “completed” the course if a 
passing grade is recorded in the appropriate “Course Performance” data element in the current 
term or in any previous term in the tracking period. 

AVERAGE GRADE: The average grade earned by members of the cohort (or by each 
designated subpopulation) who enrolled for the course. Grades are averaged on a 0.0 to 4.0 scale 
for each designated population. 
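
The four course-level statistics defined above can be sketched for a single course as follows. This is an illustration under stated assumptions, not the Toolkit's implementation: the grade-point scale, the set of passing grades, and the rule that non-graded designations such as “W” count as an attempt but are left out of the average are all assumptions, since institutions define these differently. The average here is taken over all graded attempts, including retakes.

```python
# Assumed 0.0 to 4.0 scale and passing grades; adjust to institutional policy.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
PASSING = {"A", "B", "C", "D"}


def course_status(attempts_by_student):
    """Summarize one course for a cohort or designated subpopulation.

    attempts_by_student: one list per student holding every grade designation
    recorded for the course up to the report term (empty if never attempted).
    """
    n = len(attempts_by_student)
    enrolling = sum(1 for a in attempts_by_student if len(a) >= 1)
    retaking = sum(1 for a in attempts_by_student if len(a) > 1)
    completing = sum(1 for a in attempts_by_student
                     if any(g in PASSING for g in a))
    # Only designations on the grade-point scale enter the average.
    points = [GRADE_POINTS[g] for a in attempts_by_student
              for g in a if g in GRADE_POINTS]
    return {
        "pct_enrolling": 100.0 * enrolling / n if n else 0.0,
        "pct_retaking": 100.0 * retaking / n if n else 0.0,
        "pct_completing": 100.0 * completing / n if n else 0.0,
        "average_grade": sum(points) / len(points) if points else None,
    }


# Made-up records for one course: one pass, one retake, one non-taker, one withdrawal.
english_001 = [["A"], ["F", "C"], [], ["W"]]
stats = course_status(english_001)
```

All three percentages use the full starting cohort (or subpopulation) as the denominator, so students who never attempted the course lower each rate.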

Row Variables : 

Row variables for this report are identical to those used in the Summary Progress and 
Performance Reports. 

COURSEWORK PLACEMENT/EFFECTIVENESS REPORTS 

The purpose of these reports is to help evaluate the effectiveness of initial placements in 
Reading, Writing, and Math in the light of performance in subsequent coursework. Like the 
previous reports, these are “snapshot” reports, and can be produced to reflect the status of the 
cohort as of any term in the tracking period. Column variables consist of course-specific 
performance statistics similar to those used in the Coursework Status Reports described above, 
together with some additional statistics. Row variables consist of initial placement levels in 
Reading, Writing, and Math. 

Column Variables : 

NUMBER OF STUDENTS: The number of students in the cohort who are members of the 
demographic groups described by the row labels. These totals are the same as used in previous 
reports. 

PERCENT NOW PROFICIENT IN READING, WRITING, OR MATH: The percentage of 
students who are designated as proficient at the indicated course level in reading, writing, or 
math as appropriate, and as determined by data elements in the Educational Background file for 
the term in which the report is run. 



“COURSE PERFORMANCE” STATISTICS: These are identical to the columns defined 
above for the various Coursework Status Reports (College-Level), except 
that only the “Percent Enrolling” and “Average Grade” statistics are presented for each course. 

Row Variables : 

Row variables consist of placement levels on entry as indicated by the Initial Reading Placement 
Level, Initial Writing Placement Level, or Initial Math Placement Level data elements. 

HIGH SCHOOL FEEDBACK REPORT 

The purpose of this report is to provide a set of summary performance statistics broken down by 
individual feeder high schools, to help inform articulation and recruitment arrangements with 
these schools. Column variables consist of statistics similar to those contained in the Summary 
Progress and Summary Performance Reports described above, plus some additional performance 
statistics. Row variables consist of students from each identified high school, broken down 
further on the basis of the number of years elapsed since high school graduation. 

Column Variables : 

The first four columns, “Number of Students,” “Percent Enrolled,” “Percent Completed,” and 
“Percent First Term Only,” contain statistics identical to those of the same name presented in the 
Summary Progress Report. Similarly, the last two columns contain statistics identical to the 
“Cumulative GPA” and “Credits Earned Ratio” columns of the Summary Performance Report. 
Additional columns are defined as follows: 

ENGLISH AND MATH PLACEMENT LEVELS: For the example here, assignments to 
“College” and “Below College” are made on the basis of the Writing and Math Initial Placement 
Level data elements as follows. For “English,” codes 1 and 2 of the Initial Writing Placement 
Level data element are assigned to “Below College”; for Math, code 1 of the Initial Math 
Placement Level data element is assigned to “Below College.” 
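
The recoding rule in this example amounts to a small lookup. The sketch below is illustrative only; it assumes that every placement code not listed as “Below College” is treated as “College,” which the text implies but does not state, and the names `BELOW_COLLEGE_CODES` and `placement_group` are hypothetical.

```python
# Codes assigned to "Below College" in the example above.
BELOW_COLLEGE_CODES = {
    "English": {1, 2},  # Initial Writing Placement Level codes
    "Math": {1},        # Initial Math Placement Level code
}


def placement_group(subject, code):
    """Recode an initial placement level to 'College' or 'Below College'."""
    # Assumption: any code not listed above counts as college level.
    if code in BELOW_COLLEGE_CODES[subject]:
        return "Below College"
    return "College"


groups = [placement_group("English", 2), placement_group("Math", 2)]
```

A code of 2 therefore falls below college level for English but at college level for Math.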

Row Variables : 

Row variables consist of students from each identified high school, broken down further on the 
basis of the number of years elapsed since high school graduation as indicated by the Date of 
High School Graduation data element. 

[Table: Term to Term Progression Report (Counts and Percentages)] 

[Table: Summary Progress Report (Counts and Percentages) for XX Cohort as of XX Term] 

Overall Performance Report for XX Cohort as of XX Term 